utils

This module contains several utilities used by the weatherDB package. You shouldn’t need to use them directly.

class weatherDB.utils.TimestampPeriod(start, end, tzinfo='UTC')[source]

Bases: object

A class to save a Timespan with a minimal and maximal Timestamp.

Initiate a TimestampPeriod.

Parameters:
  • start (pd.Timestamp or similar) – The start of the Period.

  • end (pd.Timestamp or similar) – The end of the Period.

  • tzinfo (str or datetime.timezone object or None, optional) – The timezone to set to the timestamps. If the timestamps already have a timezone they will get converted. If None, then the timezone is not changed or set. The default is “UTC”.
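
The timezone rule described for tzinfo can be sketched with the standard library alone. This is a minimal illustration and not the package's implementation: the helper name apply_tz is hypothetical, and plain datetime objects stand in for pd.Timestamp.

```python
from datetime import datetime, timedelta, timezone


def apply_tz(tstp, tzinfo=timezone.utc):
    """Hypothetical helper mirroring the tzinfo rule described above."""
    if tzinfo is None:
        # None: the timestamp is returned unchanged.
        return tstp
    if tstp.tzinfo is None:
        # Naive timestamp: attach the timezone without shifting the clock time.
        return tstp.replace(tzinfo=tzinfo)
    # Aware timestamp: convert it to the requested timezone.
    return tstp.astimezone(tzinfo)


naive = datetime(2021, 1, 1, 12, 0)
aware = datetime(2021, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=1)))

print(apply_tz(naive))        # 2021-01-01 12:00:00+00:00
print(apply_tz(aware))        # 2021-01-01 11:00:00+00:00
print(apply_tz(naive, None))  # 2021-01-01 12:00:00
```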

union(other, how='inner')[source]

Unite 2 TimestampPeriods to one.

Compares the Periods and computes a new one.

Parameters:
  • other (TimestampPeriod) – The other TimestampPeriod with which to compare.

  • how (str, optional) – How to compare the 2 TimestampPeriods. Can be “inner” or “outer”. “inner”: the maximal Timespan for both is computed. “outer”: The minimal Timespan for both is computed. The default is “inner”.

Returns:

A new TimestampPeriod object uniting both TimestampPeriods.

Return type:

TimestampPeriod
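
One plausible reading of the two modes, sketched on plain (start, end) tuples: “inner” is taken here to mean the overlap of both periods and “outer” the span that covers both. This reading is an assumption, and the function name union_sketch is hypothetical.

```python
from datetime import datetime


def union_sketch(period, other, how="inner"):
    """Combine two (start, end) tuples under an assumed reading of `how`."""
    if how == "inner":
        # Assumed: latest start and earliest end, i.e. the overlap.
        return (max(period[0], other[0]), min(period[1], other[1]))
    if how == "outer":
        # Assumed: earliest start and latest end, i.e. the covering span.
        return (min(period[0], other[0]), max(period[1], other[1]))
    raise ValueError(f"Unknown value for how: {how!r}")


a = (datetime(2020, 1, 1), datetime(2020, 12, 31))
b = (datetime(2020, 6, 1), datetime(2021, 6, 1))

inner = union_sketch(a, b, how="inner")
outer = union_sketch(a, b, how="outer")
```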

get_period()[source]

has_NaT()[source]

Check whether the TimestampPeriod has at least one NaT.

This means that the start or end is not given. Normally this should never happen, because it makes no sense.

Returns:

True if the TimestampPeriod has at least one NaT. False if the TimestampPeriod has both a start and an end.

Return type:

bool

has_only_NaT()[source]

Check whether the TimestampPeriod has only NaT, meaning it is empty.

This means that neither the start nor the end is given.

Returns:

True if the TimestampPeriod is empty. False if the TimestampPeriod has a start or an end.

Return type:

bool

is_empty()[source]

Check whether the TimestampPeriod is empty.

This means that neither the start nor the end is given.

Returns:

True if the TimestampPeriod is empty. False if the TimestampPeriod has a start or an end.

Return type:

bool
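
The difference between has_NaT and has_only_NaT / is_empty comes down to "any" versus "all". A minimal sketch, using None as a stand-in for pd.NaT; the function names below are hypothetical and not the package's methods:

```python
from datetime import datetime


def has_nat(start, end):
    """True if at least one boundary is missing (None stands in for NaT)."""
    return start is None or end is None


def has_only_nat(start, end):
    """True if both boundaries are missing, i.e. the period is empty."""
    return start is None and end is None


start = datetime(2021, 1, 1)

print(has_nat(start, None), has_only_nat(start, None))  # True False
print(has_nat(None, None), has_only_nat(None, None))    # True True
```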

strftime(format='%Y-%m-%d %H:%M:%S')[source]

Convert the TimestampPeriod to a list of strings.

Formats the Timestamps as strings.

Parameters:

format (str, optional) – The Timestamp format to use. The default is “%Y-%m-%d %H:%M:%S”.

Returns:

A list of the start and end of the TimestampPeriod as formatted strings.

Return type:

list of 2 strings
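
The default format string follows the usual strftime directives. A small sketch of the resulting two-element list, with plain datetime objects standing in for the period's start and end:

```python
from datetime import datetime

start = datetime(2021, 1, 1)
end = datetime(2021, 12, 31, 23, 50)

fmt = "%Y-%m-%d %H:%M:%S"
formatted = [tstp.strftime(fmt) for tstp in (start, end)]

print(formatted)  # ['2021-01-01 00:00:00', '2021-12-31 23:50:00']
```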

inside(other)[source]

Is the TimestampPeriod inside another TimestampPeriod?

Parameters:

other (TimestampPeriod or tuple of 2 Timestamps or Timestamp strings) – The other TimestampPeriod to test against. Test if this TimestampPeriod is inside the other.

Returns:

True if this TimestampPeriod is inside the other, meaning that its start is greater than or equal to the other’s start and its end is smaller than the other’s end.

Return type:

bool

contains(other)[source]

Does this TimestampPeriod contain another TimestampPeriod?

Parameters:

other (TimestampPeriod or tuple of 2 Timestamps or Timestamp strings) – The other TimestampPeriod to test against. Test if this TimestampPeriod contains the other.

Returns:

True if this TimestampPeriod contains the other, meaning that its start is smaller than or equal to the other’s start and its end is greater than the other’s end.

Return type:

bool
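
inside and contains are mirror images of each other. The sketch below works on plain (start, end) tuples and treats both boundaries as inclusive; that inclusiveness is an assumption, since the descriptions above only state the start comparison as "or equal". The function names are hypothetical.

```python
from datetime import datetime


def inside(period, other):
    """True if `period` lies within `other` (both bounds assumed inclusive)."""
    return period[0] >= other[0] and period[1] <= other[1]


def contains(period, other):
    """True if `period` fully covers `other`; the mirror image of inside()."""
    return inside(other, period)


year = (datetime(2021, 1, 1), datetime(2021, 12, 31))
june = (datetime(2021, 6, 1), datetime(2021, 6, 30))

print(inside(june, year))    # True
print(contains(year, june))  # True
print(inside(year, june))    # False
```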

get_sql_format_dict(format="'%Y%m%d %H:%M'")[source]

Get the dictionary to use in sql queries.

Parameters:

format (str, optional) – The Timestamp format to use. The default is “’%Y%m%d %H:%M’”.

Returns:

A dictionary with 2 keys (min_tstp, max_tstp) and the corresponding Timestamps as formatted strings.

Return type:

dict
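
Because the default format already wraps each value in single quotes, the dictionary can be passed straight to str.format on a query template. A sketch with plain datetime objects; the table and column names are hypothetical:

```python
from datetime import datetime

start = datetime(2021, 1, 1)
end = datetime(2021, 1, 31, 23, 50)

# The quotes inside the format string end up in the formatted values.
fmt = "'%Y%m%d %H:%M'"
sql_format_dict = {
    "min_tstp": start.strftime(fmt),
    "max_tstp": end.strftime(fmt),
}

# Hypothetical query template using the two keys.
sql = (
    "SELECT * FROM some_table "
    "WHERE timestamp BETWEEN {min_tstp} AND {max_tstp};"
).format(**sql_format_dict)

print(sql)
# SELECT * FROM some_table WHERE timestamp BETWEEN '20210101 00:00' AND '20210131 23:50';
```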

get_interval()[source]

Get the interval of the TimestampPeriod.

Returns:

The interval of this TimestampPeriod, e.g. Timedelta(2 days 12:30:12).

Return type:

pd.Timedelta

get_middle()[source]

Get the middle Timestamp of the TimestampPeriod.

Returns:

The middle Timestamp of this TimestampPeriod.

Return type:

Timestamp
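
The interval and the middle follow from simple timestamp arithmetic. A sketch with the standard library, where datetime and timedelta stand in for pd.Timestamp and pd.Timedelta:

```python
from datetime import datetime

start = datetime(2021, 1, 1)
end = datetime(2021, 1, 3, 12, 30, 12)

# The interval is the difference between end and start.
interval = end - start
# The middle lies half an interval after the start.
middle = start + interval / 2

print(interval)  # 2 days, 12:30:12
print(middle)    # 2021-01-02 06:15:06
```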

copy()[source]

Copy this TimestampPeriod.

Returns:

A new TimestampPeriod object that is equal to this one.

Return type:

TimestampPeriod

expand_to_timestamp()[source]

set_tz(tzinfo)[source]

Set the TimestampPeriod to a new timezone.

Parameters:

tzinfo (str or datetime.timezone object) – The timezone to set the TimestampPeriod to.

Returns:

This TimestampPeriod.

Return type:

TimestampPeriod

dwd

Some utility functions to get data from the DWD-CDC server.

Based on the max_fun package at https://github.com/maxschmi/max_fun. Created by Max Schmit, 2021.

weatherDB.utils.dwd.dwd_id_to_str(id)[source]

Convert a station id to normal DWD format as str.

Parameters:

id (int or str) – The id of the station.

Returns:

The station id as a string in the normal DWD format.

Return type:

str
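
A minimal sketch of such a conversion, assuming that the "normal DWD format" is the five-digit, zero-padded form seen in DWD file names such as 10minutenwerte_TU_00044_akt.zip. The function name is hypothetical and the padding width is an assumption.

```python
def station_id_to_str(station_id):
    """Zero-pad a station id to five digits (assumed DWD convention)."""
    return str(int(station_id)).zfill(5)


print(station_id_to_str(44))      # 00044
print(station_id_to_str("1443"))  # 01443
```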

weatherDB.utils.dwd.get_ftp_file_list(ftp_conn, ftp_folders)[source]

Get a list of files in the folders with their modification dates.

Parameters:
  • ftp_conn (ftplib.FTP) – Ftp connection.

  • ftp_folders (list of str or pathlike object) – The directories on the ftp server to look for files.

Returns:

A list of tuples. Every tuple stands for one file and consists of (filepath, modification date).

Return type:

list of tuples of strs
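
One way to collect such a list is the MLSD command exposed by ftplib.FTP.mlsd, which yields each entry's name together with a dictionary of facts. The sketch below assumes the server supports MLSD; the function name is hypothetical and this is not necessarily how the package does it.

```python
from pathlib import PurePosixPath


def list_files_with_mtime(ftp_conn, ftp_folders):
    """Return (filepath, modification date) tuples for every file found.

    ftp_conn is expected to behave like ftplib.FTP (only mlsd() is used).
    """
    files = []
    for folder in ftp_folders:
        # mlsd() yields (name, facts) pairs; all fact values are strings.
        for name, facts in ftp_conn.mlsd(str(folder), facts=["type", "modify"]):
            if facts.get("type") == "file":
                filepath = str(PurePosixPath(str(folder)) / name)
                files.append((filepath, facts.get("modify")))
    return files
```

With a live connection this would be called as list_files_with_mtime(ftplib.FTP(host), [folder]) after logging in; the host and folder are left to the caller.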

weatherDB.utils.dwd.get_cdc_file_list(ftp_folders)[source]

weatherDB.utils.dwd.get_dwd_file(zip_filepath)[source]

Get a DataFrame from one single (zip-)file from the DWD FTP server.

Parameters:

zip_filepath (str) – Path to the file on the server, e.g.:
  • “/climate_environment/CDC/observations_germany/climate/10_minutes/air_temperature/recent/10minutenwerte_TU_00044_akt.zip”

  • “/climate_environment/CDC/derived_germany/soil/daily/historical/derived_germany_soil_daily_historical_73.txt.gz”

Returns:

The DataFrame of the selected file in the zip folder.

Return type:

pandas.DataFrame

weatherDB.utils.dwd.get_dwd_meta(ftp_folder)[source]

Get the meta file from the ftp_folder on the DWD server.

Downloads the meta file of a given folder and corrects it for missing files: if the folder contains no file for a station, that station’s meta entry is deleted. Resets “von_datum” in the meta file if there is a gap bigger than max_hole_d. Deletes entries with fewer years than min_years.

Parameters:

ftp_folder (str) – The path to the directory where to search for the meta file. e.g. “climate_environment/CDC/observations_germany/climate/hourly/precipitation/recent/”.

Returns:

A GeoDataFrame of the meta file.

Return type:

geopandas.GeoDataFrame

geometry

A collection of geometry functions.

Based on the max_fun package at https://github.com/maxschmi/max_fun. Created by Max Schmit, 2021.

weatherDB.utils.geometry.polar_line(center_xy, radius, angle)[source]

Create a LineString with polar coordinates.

Parameters:
  • center_xy (list, array or tuple of int or floats) – The X and Y coordinates of the center.

  • radius (int or float) – The radius of the circle.

  • angle (int) – The angle of the portion of the circle in degrees. 0 means east.

Returns:

LineString.

Return type:

shapely.geometry.LineString
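
The underlying geometry is a conversion from polar to Cartesian coordinates. The sketch below returns the two end points as plain tuples instead of a LineString, so that it runs without shapely. It assumes the angle increases counter-clockwise, which the description above does not state; the function name is hypothetical.

```python
import math


def polar_line_coords(center_xy, radius, angle):
    """Return [start, end] points of a line from the center at `angle` degrees."""
    x, y = center_xy
    theta = math.radians(angle)
    # 0 degrees points east; counter-clockwise rotation is an assumption.
    end = (x + radius * math.cos(theta), y + radius * math.sin(theta))
    return [(x, y), end]


print(polar_line_coords((10, 20), 5, 0))  # [(10, 20), (15.0, 20.0)]
```

The resulting coordinate pair could be passed to shapely.geometry.LineString to obtain the documented return type.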

weatherDB.utils.geometry.raster2points(raster_np, transform, crs=None)[source]

Polygonize raster array to GeoDataFrame.

So far, this only works for rasters with one band.

Parameters:
  • raster_np (np.array) – The imported raster array.

  • transform (rio.Affine) – The Affine transformation of the raster.

  • crs (str or crs-type, optional) – The coordinate reference system for the raster. The default is None.

Returns:

The raster Data is in the data column.

Return type:

geopandas.GeoDataFrame
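
The role of the affine transformation can be illustrated without rasterio or geopandas. The sketch below maps every cell of a single-band raster to the coordinates of its cell centre, using the six coefficients (a, b, c, d, e, f) in the order used by the affine package. Whether the package works with cell centres or cell polygons is not stated above; centres are assumed here, and the function name is hypothetical.

```python
def raster_cell_centres(raster, transform):
    """Return (x, y, value) for every cell of a 2D raster.

    transform holds the affine coefficients (a, b, c, d, e, f), so that
    x = a * col + b * row + c and y = d * col + e * row + f.
    """
    a, b, c, d, e, f = transform
    points = []
    for row, values in enumerate(raster):
        for col, value in enumerate(values):
            # Adding 0.5 moves from the cell's corner to its centre.
            x = a * (col + 0.5) + b * (row + 0.5) + c
            y = d * (col + 0.5) + e * (row + 0.5) + f
            points.append((x, y, value))
    return points


raster = [[1, 2], [3, 4]]
# 10 x 10 unit cells, upper-left corner at (100, 200), north-up.
transform = (10.0, 0.0, 100.0, 0.0, -10.0, 200.0)

points = raster_cell_centres(raster, transform)
print(points[0])  # (105.0, 195.0, 1)
```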

get_data

Some utility functions to download the data needed for the module to work.

weatherDB.utils.get_data.download_ma_rasters(which='all', overwrite=None, update_user_config=False)[source]

Get the multi-annual rasters on which the regionalisation is based.

The refined multi-annual datasets that are downloaded are published on Zenodo [1].

References

Parameters:
  • which (str or [str], optional) – Which raster to download. Options are “dwd”, “hyras”, “regnie” and “all”. The default is “all”.

  • overwrite (bool, optional) – Should the multi annual rasters be downloaded even if they already exist? If None the user will be asked. The default is None.

  • update_user_config (bool, optional) – Should the downloaded rasters be set as the regionalisation rasters in the user configuration file? The default is False.

weatherDB.utils.get_data.download_dem(overwrite=None, extent=(5.3, 46.1, 15.6, 55.4), update_user_config=False)[source]

Download the newest DEM data from the Copernicus Sentinel dataset.

Only the GLO-30 DEM, which has a 30 m resolution, is downloaded, as it is freely available. If you register as a scientific researcher, the EEA-10 with 10 m resolution is also available, but you will have to download that data yourself and define it in the configuration file.

After downloading the data, the files are merged and saved as a single tif file in the data directory in a subfolder called ‘DEM’. To use the DEM data in the WeatherDB, you will have to define the path to the tif file in the configuration file.

Source: Copernicus DEM - Global and European Digital Elevation Model. Digital Surface Model (DSM) provided in 3 different resolutions (90m, 30m, 10m) with varying geographical extent (EEA: European and GLO: global) and varying format (INSPIRE, DGED, DTED). DOI:10.5270/ESA-c5d3d65.

Parameters:
  • overwrite (bool, optional) – Should the DEM data be downloaded even if it already exists? If None the user will be asked. The default is None.

  • extent (tuple, optional) – The extent in WGS84 of the DEM data to download. The default is the boundary of Germany + ~40 km = (5.3, 46.1, 15.6, 55.4).

  • update_user_config (bool, optional) – Should the downloaded DEM be set as the used DEM in the user configuration file? The default is False.

logging

weatherDB.utils.logging.remove_old_logs(max_days=14)[source]

weatherDB.utils.logging.setup_logging_handlers()[source]

Set up the logging handlers depending on the configuration.

Raises:

ValueError – If the handler type is not known.
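
The pattern of choosing a handler by a configured type and rejecting unknown types can be sketched with the standard logging module. The type names "console" and "null" and the function name are hypothetical; the handler types the package actually accepts are defined by its configuration.

```python
import logging
import sys


def build_handler(handler_type):
    """Create a logging handler for a (hypothetical) configured type name."""
    if handler_type == "console":
        return logging.StreamHandler(sys.stdout)
    if handler_type == "null":
        return logging.NullHandler()
    # Mirrors the documented behaviour: unknown types raise a ValueError.
    raise ValueError(f"Unknown handler type: {handler_type!r}")


logger = logging.getLogger("handler_sketch")
logger.addHandler(build_handler("console"))
```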