weatherDB

station

This module has a class for every type of station, e.g. StationN. One object represents one station with one parameter and can be used to get the corresponding timeseries. There is also a GroupStation class that groups the three parameters precipitation, temperature and evapotranspiration together for one station.

StationN

class weatherDB.station.StationN(id, **kwargs)

Bases: weatherDB.station.StationNBase

A class to work with and download 10-minute precipitation data for one station.

__init__(id, **kwargs)

Create a Station object.

Parameters
  • id (int) – The station's ID.

  • _skip_meta_check (bool, optional) – Whether to skip the check that the station is in the database meta file. Skipping the check makes the initialization faster, but be careful, because it can lead to problems. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.

Raises

NotImplementedError – _description_

update_horizon(skip_if_exist=True)

Update the horizon angle (Horizontabschirmung) in the meta table.

Get new values from the raster and put in the table.

Parameters

skip_if_exist (bool, optional) – Skip updating the value if there is already a value in the meta table. The default is True.

Returns

The horizon angle in degrees (Horizontabschirmung).

Return type

float

update_richter_class(skip_if_exist=True)

Update the Richter class in the meta table.

Get new values from the raster and put in the table.

Parameters

skip_if_exist (bool, optional) – Skip updating the value if there is already a value in the meta table. The default is True.

Returns

The Richter class name.

Return type

str

richter_correct(period=(None, None), **kwargs)

Do the richter correction on the filled data for the given period.

Parameters

period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).

Raises

Exception – If no richter class was found for this station.
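
The way None bounds in period are filled in can be pictured with a small sketch. This is not weatherDB's implementation: resolve_period and the available bounds are hypothetical names used only to illustrate the rule that a None bound is replaced by the widest possible timestamp.

```python
from datetime import datetime


def resolve_period(period, available):
    """Replace None bounds in `period` with the bounds of `available`.

    Hypothetical helper, not part of weatherDB.
    """
    start, end = period
    avail_start, avail_end = available
    # A None bound means "take the widest possible bound".
    start = avail_start if start is None else start
    end = avail_end if end is None else end
    return (start, end)


# Hypothetical range of data available for one station.
available = (datetime(1995, 1, 1), datetime(2020, 12, 31))

full = resolve_period((None, None), available)
partial = resolve_period((datetime(2010, 1, 1), None), available)
```

With (None, None) the whole available range is used. With only one bound set to None, just that bound is replaced.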

corr(period=(None, None))

last_imp_richter_correct(_last_imp_period=None)

Do the richter correction of the last import.

Parameters

_last_imp_period (_type_, optional) – The overall period of the last import. This is only for internal use by the StationsN method, to avoid computing the period over and over again. The default is None.

last_imp_corr(_last_imp_period=None)

A wrapper for last_imp_richter_correct().

fillup(period=(None, None), **kwargs)

Fill up missing data with measurements from nearby stations.

get_corr(period=(None, None))

get_qn(period=(None, None))

get_richter_class(update_if_fails=True)

Get the richter class for this station.

Provide the data from the meta table.

Parameters

update_if_fails (bool, optional) – Should the Richter class get updated if no exposition class is found in the meta table? If False and no exposition class was found, None is returned. The default is True.

Returns

The corresponding richter exposition class.

Return type

string

get_horizon()

Get the value for the horizon angle (Horizontabschirmung).

This value is defined by Richter (1995) as the mean horizon angle in the west direction as: H' = 0.15·H(S-SW) + 0.35·H(SW-W) + 0.35·H(W-NW) + 0.15·H(NW-N)

Returns

The mean western horizon angle

Return type

float or None
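
The weighted mean from Richter (1995) quoted above can be written out as a short sketch. The function name and the four argument names are hypothetical. Only the weights 0.15, 0.35, 0.35 and 0.15 come from the formula in this section.

```python
def mean_western_horizon(h_s_sw, h_sw_w, h_w_nw, h_nw_n):
    """Weighted mean horizon angle in the west direction, in degrees.

    Each argument is the horizon angle of one sector
    (S-SW, SW-W, W-NW, NW-N). Hypothetical helper, not part of weatherDB.
    """
    return (
        0.15 * h_s_sw
        + 0.35 * h_sw_w
        + 0.35 * h_w_nw
        + 0.15 * h_nw_n
    )


# Hypothetical sector angles in degrees.
h = mean_western_horizon(2.0, 4.0, 6.0, 8.0)  # about 5.0 degrees
```

The two western sectors (SW-W and W-NW) dominate the result, because together they carry 70 % of the weight.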

StationT

class weatherDB.station.StationT(id, **kwargs)

Bases: weatherDB.station.StationTETBase

A class to work with and download temperature data for one station.

__init__(id, **kwargs)

Create a Station object.

Parameters
  • id (int) – The station's ID.

  • _skip_meta_check (bool, optional) – Whether to skip the check that the station is in the database meta file. Skipping the check makes the initialization faster, but be careful, because it can lead to problems. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.

Raises

NotImplementedError – _description_

get_multi_annual()

Get the multi annual value(s) for this station.

Returns

The corresponding multi annual value. For T and ET the yearly value is returned. For N the winter and summer half yearly sums are returned as a tuple.

Return type

list or number

get_adj(period=(None, None))

Get the adjusted timeseries.

The timeseries is adjusted to the multi annual mean, so the overall mean of the given period will be the same as the multi annual mean.

Parameters
  • period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).

  • agg_to (str or None, optional) – Aggregate to a given timespan. Can be anything smaller than the maximum timespan of the saved data. If a time period smaller than that of the saved data is given, then the maximum possible time period is returned. For T and ET it can be “month” or “year”. For N it can also be “hour”. If None, then the maximum time period is taken. The default is None.

Returns

A timeseries with the adjusted data.

Return type

pandas.DataFrame

StationET

class weatherDB.station.StationET(id, **kwargs)

Bases: weatherDB.station.StationTETBase

A class to work with and download potential Evapotranspiration (VPGB) data for one station.

__init__(id, **kwargs)

Create a Station object.

Parameters
  • id (int) – The station's ID.

  • _skip_meta_check (bool, optional) – Whether to skip the check that the station is in the database meta file. Skipping the check makes the initialization faster, but be careful, because it can lead to problems. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.

Raises

NotImplementedError – _description_

get_adj(period=(None, None))

Get the adjusted timeseries.

The timeseries is adjusted to the multi annual mean, so the overall mean of the given period will be the same as the multi annual mean.

Parameters
  • period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).

  • agg_to (str or None, optional) – Aggregate to a given timespan. Can be anything smaller than the maximum timespan of the saved data. If a time period smaller than that of the saved data is given, then the maximum possible time period is returned. For T and ET it can be “month” or “year”. For N it can also be “hour”. If None, then the maximum time period is taken. The default is None.

Returns

A timeseries with the adjusted data.

Return type

pandas.DataFrame
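
The idea of adjusting a series so that its overall mean matches a target mean can be sketched as follows. This is only an illustration under a loud assumption: it uses one multiplicative scaling factor for the whole period, which may differ from what weatherDB actually does. The function name and the sample values are hypothetical.

```python
def adjust_to_mean(values, target_mean):
    """Scale `values` so that their mean equals `target_mean`.

    Assumption: a single multiplicative factor is applied to every value.
    Hypothetical helper, not part of weatherDB.
    """
    current_mean = sum(values) / len(values)
    factor = target_mean / current_mean
    return [value * factor for value in values]


filled = [1.0, 2.0, 3.0, 2.0]  # hypothetical filled values, mean 2.0
adjusted = adjust_to_mean(filled, target_mean=2.5)
```

After the scaling, the mean of adjusted equals the target, while the relative variation between the single values is unchanged.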

StationND

class weatherDB.station.StationND(id, **kwargs)

Bases: weatherDB.station.StationNBase, weatherDB.station.StationCanVirtualBase

A class to work with and download daily precipitation data for one station.

These station data are only downloaded to do some quality checks on the 10-minute data. Therefore no special quality check or Richter correction is done on this data. If you want daily precipitation data, it is better to use the 10-minute station class (StationN) and aggregate to daily values.

property quality_check

(!) Disallowed inherited

property last_imp_quality_check

(!) Disallowed inherited

property get_corr

(!) Disallowed inherited

property get_adj

(!) Disallowed inherited

property get_qc

(!) Disallowed inherited

__init__(id, **kwargs)

Create a Station object.

Parameters
  • id (int) – The station's ID.

  • _skip_meta_check (bool, optional) – Whether to skip the check that the station is in the database meta file. Skipping the check makes the initialization faster, but be careful, because it can lead to problems. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.

Raises

NotImplementedError – _description_

GroupStation

class weatherDB.station.GroupStation(id, error_if_missing=True, **kwargs)

Bases: object

A class to group all possible parameters of one station.

So if you want to create the input files for a simulation, where you need T, ET and N, use this class to download the data for one station.

__init__(id, error_if_missing=True, **kwargs)

get_available_paras(short=False)

Get the possible parameters for this station.

Parameters

short (bool, optional) – Should the short names of the parameters be returned? The default is False, so the long names are returned.

Returns

A list of the parameter names (long or short) that are available for this station.

Return type

list of str

get_filled_period(kind='best', from_meta=True)

Get the combined filled period for all 3 stations.

This is the maximum possible timerange for these stations.

Parameters
  • kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.

  • from_meta (bool, optional) – Should the period be taken from the meta table? If False, the period is returned from the timeseries. In this case this function is only a wrapper for .get_period_meta. The default is True.

Returns

The maximum filled period for the 3 parameters for this station.

Return type

TimestampPeriod
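
The rule for kind described above (which values are valid, and what “best” resolves to) can be sketched in a few lines. The function and constant names are hypothetical and the parameter codes "n", "t" and "et" follow the ones used elsewhere in this reference. The sketch only mirrors the rule as documented here, not weatherDB's internal code.

```python
BASE_KINDS = {"raw", "qc", "filled", "adj"}
PRECIPITATION_ONLY_KINDS = {"qn", "corr"}


def resolve_kind(para, kind="best"):
    """Return the concrete data kind for parameter `para` ("n", "t" or "et").

    Hypothetical helper, not part of weatherDB.
    """
    if kind == "best":
        # Precipitation has a Richter-corrected column, the others do not.
        return "corr" if para == "n" else "filled"
    if para == "n":
        valid = BASE_KINDS | PRECIPITATION_ONLY_KINDS
    else:
        valid = BASE_KINDS
    if kind not in valid:
        raise ValueError(f"kind {kind!r} is not valid for parameter {para!r}")
    return kind


best_n = resolve_kind("n")   # "corr"
best_t = resolve_kind("t")   # "filled"
```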

get_df(period=(None, None), kind='best', paras='all', agg_to='day')

Get a DataFrame with the corresponding data.

Parameters
  • period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).

  • kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.

  • agg_to (str, optional) – The aggregation level to which the timeseries should be aggregated. The minimum aggregation for temperature and ET is daily and for precipitation it is 10 minutes. If a smaller aggregation is selected, the minimum possible aggregation for the respective parameter is returned. So if 10 minutes is selected, then precipitation is returned in 10 minutes and T and ET as daily. The default is “day”.

Returns

A DataFrame with the timeseries for this station and the given period.

Return type

pandas.DataFrame
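
The fallback described for agg_to (a request finer than a parameter's own resolution is replaced by that resolution) can be sketched as below. The names AGG_ORDER, MIN_AGG and effective_agg are hypothetical, and the set of aggregation strings is assumed from the values mentioned in this reference.

```python
# Aggregation levels ordered from finest to coarsest (assumed set of values).
AGG_ORDER = ["10 min", "hour", "day", "month", "year"]

# Finest resolution stored per parameter, as described in this reference.
MIN_AGG = {"n": "10 min", "t": "day", "et": "day"}


def effective_agg(para, agg_to):
    """Return the aggregation level actually used for parameter `para`.

    Hypothetical helper, not part of weatherDB.
    """
    requested = AGG_ORDER.index(agg_to)
    finest = AGG_ORDER.index(MIN_AGG[para])
    # Never go finer than the resolution the data is stored in.
    return AGG_ORDER[max(requested, finest)]


levels = {para: effective_agg(para, "10 min") for para in ("n", "t", "et")}
```

Requesting “10 min” therefore yields 10-minute precipitation but daily T and ET, which matches the behaviour described for agg_to.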

classmethod get_meta_explanation(infos='all')

Get the explanations of the available meta fields.

Parameters

infos (list or string, optional) – The infos you wish to get an explanation for. If “all”, then all the available information is returned. The default is “all”.

Returns

A pandas Series with the information names as index and the explanations as values.

Return type

pd.Series

get_meta(paras='all', **kwargs)

Get the meta information for every parameter of this station.

Parameters
  • paras (list of str or str, optional) – Give the parameters for which to get the meta information. Can be “n”, “t”, “et” or “all”. If “all”, then every available station parameter is returned. The default is “all”

  • kwargs (dict, optional) – The optional keyword arguments are handed to the single Station get_meta methods. Can be e.g. “info”.

Returns

A dict with the information. There is one subdict per parameter. If only one parameter is asked for, then there is no subdict, but only a single value.

Return type

dict

get_geom()

get_name()

create_roger_ts(dir, period=(None, None), kind='best', r_r0=1)

Create the timeseries files for roger as csv.

This is only a wrapper function for create_ts with some standard settings.

Parameters
  • dir (pathlib like object or zipfile.ZipFile) – The directory or Zipfile to store the timeseries in. If a zipfile is given, a folder with the station's ID is added to the filepath.

  • period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)

  • kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.

  • r_r0 (int or float, list of int or float or None, optional) – Should the ET timeseries contain a column with R/R0? If None, then no column is added. If int or float, then a R/R0 column is appended with this number as standard value. If list of int or floats, then the list should have the same length as the ET timeseries and is appended to the timeseries. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET timeseries. The default is 1.

Raises

Warning – If there are NAs in the timeseries or the period got changed.

create_ts(dir, period=(None, None), kind='best', agg_to='10 min', r_r0=None, split_date=False)

Create the timeseries files as csv.

Parameters
  • dir (pathlib like object or zipfile.ZipFile) – The directory or Zipfile to store the timeseries in. If a zipfile is given, a folder with the station's ID is added to the filepath.

  • period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)

  • kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.

  • agg_to (str, optional) – The aggregation level to which the timeseries should be aggregated. The minimum aggregation for temperature and ET is daily and for precipitation it is 10 minutes. If a smaller aggregation is selected, the minimum possible aggregation for the respective parameter is returned. So if 10 minutes is selected, then precipitation is returned in 10 minutes and T and ET as daily. The default is “10 min”.

  • r_r0 (int or float or None or pd.Series or list, optional) – Should the ET timeseries contain a column with R/R0? If None, then no column is added. If int or float, then a R/R0 column is appended with this number as standard value. If list of int or floats, then the list should have the same length as the ET timeseries and is appended to the timeseries. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET timeseries. The default is None.

  • split_date (bool, optional) – Should the timestamp get split into parts, so one column for year, one for month etc.? If False, the timestamp is saved in one column as string. The default is False.

Raises

Warning – If there are NAs in the timeseries or the period got changed.
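
How the different r_r0 input types could translate into a column is sketched below with plain lists. build_r_r0_column is a hypothetical helper, not a weatherDB function. The pd.Series case (joined on a timestamp index) is left out on purpose to keep the sketch free of dependencies.

```python
def build_r_r0_column(n_rows, r_r0):
    """Return the R/R0 values for an ET series with `n_rows` rows.

    Returns None if no column should be added.
    Hypothetical helper, not part of weatherDB.
    """
    if r_r0 is None:
        return None  # no R/R0 column is added
    if isinstance(r_r0, (int, float)):
        return [r_r0] * n_rows  # the same standard value in every row
    values = list(r_r0)
    if len(values) != n_rows:
        raise ValueError(
            "an r_r0 list must have the same length as the ET timeseries"
        )
    return values


constant = build_r_r0_column(3, 1)             # [1, 1, 1]
per_row = build_r_r0_column(3, [0.9, 1.0, 1.1])
no_column = build_r_r0_column(3, None)         # None
```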

stations

This module has grouping classes for all the stations of one parameter, e.g. StationsN groups all the available precipitation stations. These classes can be used to do actions on all the stations.

StationsN

class weatherDB.stations.StationsN

Bases: weatherDB.stations.StationsBase

A class to work with and download 10-minute precipitation data for several stations.

update_richter_class(stids='all')

Update the Richter exposition class.

Get the value from the raster, compare with the richter categories and save to the database.

Parameters

stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

Raises

ValueError – If the given stids (Station_IDs) are not all valid.
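
The stids convention used by the methods of this module (“all” or a list of station IDs, with a ValueError for unknown IDs) can be sketched like this. resolve_stids and the example IDs are hypothetical. The real classes look the valid IDs up in the database.

```python
def resolve_stids(stids, valid_stids):
    """Return the list of station IDs to work on.

    `stids` is either "all" or a list of int.
    Hypothetical helper, not part of weatherDB.
    """
    if stids == "all":
        return list(valid_stids)
    invalid = [stid for stid in stids if stid not in valid_stids]
    if invalid:
        raise ValueError(f"Invalid station IDs: {invalid}")
    return list(stids)


valid = [1224, 1443, 2074]  # hypothetical IDs present in the meta table
everything = resolve_stids("all", valid)
subset = resolve_stids([1443], valid)
```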

richter_correct(stids='all')

Richter correct the filled data.

Parameters

stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

Raises

ValueError – If the given stids (Station_IDs) are not all valid.

last_imp_corr(stids='all')

Richter correct the filled data for the last imported period.

Parameters

stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

Raises

ValueError – If the given stids (Station_IDs) are not all valid.

StationsT

class weatherDB.stations.StationsT

Bases: weatherDB.stations.StationsTETBase

A class to work with and download temperature data for several stations.

StationsET

class weatherDB.stations.StationsET

Bases: weatherDB.stations.StationsTETBase

A class to work with and download potential Evapotranspiration (VPGB) data for several stations.

StationsND

class weatherDB.stations.StationsND

Bases: weatherDB.stations.StationsBase

A class to work with and download daily precipitation data for several stations.

These station data are only downloaded to do some quality checks on the 10-minute data. Therefore no special quality check or Richter correction is done on this data. If you want daily precipitation data, it is better to use the 10-minute station class (StationN) and aggregate to daily values.

GroupStations

class weatherDB.stations.GroupStations

Bases: object

A class to group all possible parameters of all the stations.

__init__()

get_valid_stids()

classmethod get_meta_explanation(infos='all')

Get the explanations of the available meta fields.

Parameters

infos (list or string, optional) – The infos you wish to get an explanation for. If “all”, then all the available information is returned. The default is “all”.

Returns

A pandas Series with the information names as index and the explanations as values.

Return type

pd.Series

get_meta(paras='all', stids='all', **kwargs)

Get the meta Dataframe from the Database.

Parameters
  • paras (list or str, optional) – The parameters for which to get the information. If “all” then all the available parameters are requested. The default is “all”.

  • stids (string or list of int, optional) – The Stations to return the meta information for. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

  • **kwargs (dict, optional) – The keyword arguments are passed to the station.GroupStation().get_meta method. From there they are passed to the single station get_meta method. Can be e.g. “infos”.

Returns

dict of pandas.DataFrame or geopandas.GeoDataFrame, or pandas.DataFrame or geopandas.GeoDataFrame – The meta DataFrame. If several parameters are asked for, then a dict with an entry per parameter is returned.

Raises
  • ValueError – If the given stids (Station_IDs) are not all valid.

  • ValueError – If the given paras are not all valid.

get_para_stations(paras='all')

Get a list with all the multi parameter stations as stations.Station{parameter}-objects.

Parameters

paras (list or str, optional) – The parameters for which to get the objects. If “all” then all the available parameters are requested. The default is “all”.

Returns

A list with the corresponding station objects.

Return type

list of Station-objects

Raises

ValueError – If the given stids (Station_IDs) are not all valid.

get_group_stations(stids='all', **kwargs)

Get a list with all the stations as station.GroupStation-objects.

Parameters
  • stids (string or list of int, optional) – The Stations to return. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

  • **kwargs (optional) – The keyword arguments are handed to the creation of the single GroupStation objects. Can be e.g. “error_if_missing”.

Returns

A list with the corresponding station objects.

Return type

list of GroupStation-objects

Raises

ValueError – If the given stids (Station_IDs) are not all valid.

create_ts(dir, period=(None, None), kind='best', stids='all', agg_to='10 min', r_r0=None, split_date=False)

Download and create the weather tables as csv files.

Parameters
  • dir (path-like object) – The directory where to save the tables. If the directory is a ZipFile, then the output will get zipped into this.

  • period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)

  • kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.

  • stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

  • agg_to (str, optional) – The aggregation level to which the timeseries should be aggregated. The minimum aggregation for temperature and ET is daily and for precipitation it is 10 minutes. If a smaller aggregation is selected, the minimum possible aggregation for the respective parameter is returned. So if 10 minutes is selected, then precipitation is returned in 10 minutes and T and ET as daily. The default is “10 min”.

  • r_r0 (int or float or None or pd.Series or list, optional) – Should the ET timeseries contain a column with R/R0? If None, then no column is added. If int or float, then a R/R0 column is appended with this number as standard value. If list of int or floats, then the list should have the same length as the ET timeseries and is appended to the timeseries. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET timeseries. The default is None.

  • split_date (bool, optional) – Should the timestamp get split into parts, so one column for year, one for month etc.? If False, the timestamp is saved in one column as string. The default is False.
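
The effect of split_date on one row can be pictured with the sketch below. The column names and the string format are assumptions made for illustration. The csv files written by weatherDB may use different headers and formats.

```python
from datetime import datetime


def timestamp_columns(ts, split_date=False):
    """Return the timestamp part of one csv row as a dict.

    Column names are hypothetical. Not part of weatherDB.
    """
    if split_date:
        # One column per part of the timestamp.
        return {
            "year": ts.year,
            "month": ts.month,
            "day": ts.day,
            "hour": ts.hour,
            "minute": ts.minute,
        }
    # One single column holding the timestamp as a string.
    return {"timestamp": ts.isoformat(sep=" ")}


ts = datetime(2020, 5, 17, 13, 40)
split_row = timestamp_columns(ts, split_date=True)
single_row = timestamp_columns(ts)
```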

create_roger_ts(dir, period=(None, None), stids='all', kind='best', r_r0=1)

Create the timeseries files for roger as csv.

This is only a wrapper function for create_ts with some standard settings.

Parameters
  • dir (pathlib like object or zipfile.ZipFile) – The directory or Zipfile to store the timeseries in. If a zipfile is given, a folder with the station's ID is added to the filepath.

  • period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)

  • stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.

  • kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.

  • r_r0 (int or float or None or pd.Series or list, optional) – Should the ET timeseries contain a column with R/R0? If None, then no column is added. If int or float, then a R/R0 column is appended with this number as standard value. If list of int or floats, then the list should have the same length as the ET timeseries and is appended to the timeseries. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET timeseries. The default is 1.

Raises

Warning – If there are NAs in the timeseries or the period got changed.

broker

This submodule has only one class, Broker. It is used to do actions on all the stations together, mainly for updating the DB.

Broker

class weatherDB.broker.Broker

Bases: object

A class to manage and update the database.

Can be used to update all the stations and parameters at once.

This class only works with superuser privileges.

__init__()

update_raw(only_new=True, paras=['n_d', 'n', 't', 'et'])

Update the raw data from the DWD-CDC server to the database.

Parameters
  • only_new (bool, optional) – Get only the files that are not yet in the database? If False all the available files are loaded again. The default is True.

  • paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].

update_meta(paras=['n_d', 'n', 't', 'et'])

Update the meta file from the CDC Server to the Database.

Parameters

paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].

update_ma(paras=['n_d', 'n', 't', 'et'])

Update the multi-annual data from raster to table.

Parameters

paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].

update_period_meta(paras=['n_d', 'n', 't', 'et'])

Update the periods in the meta table.

Parameters

paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].

quality_check(paras=['n', 't', 'et'], with_fillup_nd=True)

Do the quality check on the stations' raw data.

Parameters
  • paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n”, “t”, “et”]. The default is [“n”, “t”, “et”].

  • with_fillup_nd (bool, optional) – Should the daily precipitation data get filled up if the 10 minute precipitation data gets quality checked. The default is True.

last_imp_quality_check(paras=['n', 't', 'et'], with_fillup_nd=True)

Quality check the last imported data.

Also fills up the daily precipitation data if the 10 minute precipitation data should get quality checked.

Parameters
  • paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n”, “t”, “et”]. The default is [“n”, “t”, “et”].

  • with_fillup_nd (bool, optional) – Should the daily precipitation data get filled up if the 10 minute precipitation data gets quality checked. The default is True.

fillup(paras=['n', 't', 'et'])

Fillup the timeseries.

Parameters

paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n”, “t”, “et”].

last_imp_fillup(paras=['n', 't', 'et'])

Fillup the last imported data.

Parameters

paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n”, “t”, “et”].

richter_correct()

Richter correct all of the precipitation data.

last_imp_corr()

Richter correct the last imported precipitation data.

update_db(paras=['n_d', 'n', 't', 'et'])

The regular Update of the database.

Downloads new data. Quality checks the newly imported data. Fills up the newly imported data.

Parameters

paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].
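
The order of the three steps named above (download, quality check, fill up) can be illustrated with a stand-in object that only records calls. Everything here is an assumption made for illustration: RecordingBroker and regular_update are hypothetical, the mapping of the steps to update_raw, last_imp_quality_check and last_imp_fillup is a guess based on the method names in this reference, and leaving “n_d” out of the later steps simply follows the default paras of those methods. No database is touched.

```python
class RecordingBroker:
    """Hypothetical stand-in that records which steps were called."""

    def __init__(self):
        self.calls = []

    def update_raw(self, paras):
        self.calls.append(("update_raw", tuple(paras)))

    def last_imp_quality_check(self, paras):
        self.calls.append(("last_imp_quality_check", tuple(paras)))

    def last_imp_fillup(self, paras):
        self.calls.append(("last_imp_fillup", tuple(paras)))


def regular_update(broker, paras=("n_d", "n", "t", "et")):
    """Run the three documented steps in order (assumed mapping)."""
    broker.update_raw(paras=paras)
    # Assumption: daily precipitation ("n_d") is not quality checked itself.
    checked = [para for para in paras if para != "n_d"]
    broker.last_imp_quality_check(paras=checked)
    broker.last_imp_fillup(paras=checked)


broker = RecordingBroker()
regular_update(broker)
step_names = [name for name, _ in broker.calls]
```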

initiate_db()

Initiate the Database.

Downloads all the data from the CDC server for the first time. Updates the multi-annual data and the richter-class for all the stations. Quality checks and fills up the timeseries.
