utils

This module contains several utilities used by the WeatherDB package. You shouldn’t need to use them directly.

TimestampPeriod

class weatherdb.utils.TimestampPeriod(start, end, tzinfo='UTC')[source]

Bases: object

A class to save a Timespan with a minimal and maximal Timestamp.

Initiate a TimestampPeriod.

Parameters:
  • start (pd.Timestamp or similar) – The start of the Period.

  • end (pd.Timestamp or similar) – The end of the Period.

  • tzinfo (str or datetime.timezone object or None, optional) – The timezone to apply to the timestamps. If the timestamps already have a timezone, they are converted. If None, the timezone is neither changed nor set. The default is “UTC”.
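
The tzinfo rule can be illustrated with the standard library alone. This is a minimal sketch, not WeatherDB's implementation: the helper name apply_tz is hypothetical, plain datetime objects stand in for pd.Timestamp, and it assumes that "setting" a timezone on a naive timestamp keeps the clock time unchanged.

```python
from datetime import datetime, timedelta, timezone


def apply_tz(tstp, tzinfo=timezone.utc):
    """Hypothetical helper mirroring the documented tzinfo rule."""
    if tzinfo is None:
        # None: leave the timestamp untouched
        return tstp
    if tstp.tzinfo is None:
        # naive timestamp: attach the timezone without shifting the clock time
        return tstp.replace(tzinfo=tzinfo)
    # aware timestamp: convert to the requested timezone
    return tstp.astimezone(tzinfo)


naive = datetime(2021, 1, 1, 12, 0)
aware = datetime(2021, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=1)))

print(apply_tz(naive))        # 2021-01-01 12:00:00+00:00
print(apply_tz(aware))        # 2021-01-01 11:00:00+00:00
print(apply_tz(naive, None))  # 2021-01-01 12:00:00
```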

union(other, how='inner')[source]

Unite two TimestampPeriods into one.

Compares the Periods and computes a new one.

Parameters:
  • other (TimestampPeriod) – The other TimestampPeriod to compare with.

  • how (str, optional) – How to compare the 2 TimestampPeriods. Can be “inner” or “outer”. “inner”: the maximal Timespan for both is computed. “outer”: the minimal Timespan for both is computed. The default is “inner”.

Returns:

A new TimestampPeriod object uniting both TimestampPeriods.

Return type:

TimestampPeriod
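
A rough sketch of what the two modes might compute, using plain (start, end) tuples. It rests on an assumption about the wording above: "inner" is read as the span shared by both periods and "outer" as the span covering both. The function name union_sketch is hypothetical and this is not the library's code.

```python
from datetime import datetime


def union_sketch(a, b, how="inner"):
    """Hypothetical stand-in working on (start, end) tuples."""
    if how == "inner":
        # assumed meaning: the span shared by both periods
        return (max(a[0], b[0]), min(a[1], b[1]))
    if how == "outer":
        # assumed meaning: the span that covers both periods
        return (min(a[0], b[0]), max(a[1], b[1]))
    raise ValueError(f"unknown value for how: {how!r}")


p1 = (datetime(2020, 1, 1), datetime(2020, 12, 31))
p2 = (datetime(2020, 6, 1), datetime(2021, 6, 30))

inner = union_sketch(p1, p2, how="inner")  # 2020-06-01 to 2020-12-31
outer = union_sketch(p1, p2, how="outer")  # 2020-01-01 to 2021-06-30
```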

get_period()[source]

has_NaT()[source]

Check whether the TimestampPeriod has at least one NaT.

This means that the start or end is not given. Normally this should never happen, because it makes no sense.

Returns:

True if the TimestampPeriod has at least one NaT. False if the TimestampPeriod has both a start and an end.

Return type:

bool

has_only_NaT()[source]

Check whether the TimestampPeriod has only NaT, meaning it is empty.

This means that neither the start nor the end is given.

Returns:

True if the TimestampPeriod is empty. False if the TimestampPeriod has at least a start or an end.

Return type:

bool

is_empty()[source]

Check whether the TimestampPeriod is empty.

This means that neither the start nor the end is given.

Returns:

True if the TimestampPeriod is empty. False if the TimestampPeriod has at least a start or an end.

Return type:

bool
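
The NaT checks differ only in whether one or both bounds are missing. A small sketch with None standing in for NaT; the helper names are hypothetical and only illustrate the logic:

```python
from datetime import datetime


def has_nat(period):
    # True as soon as one bound is missing
    return any(bound is None for bound in period)


def has_only_nat(period):
    # True only if both bounds are missing, i.e. the period is empty
    return all(bound is None for bound in period)


full = (datetime(2021, 1, 1), datetime(2021, 12, 31))
half = (datetime(2021, 1, 1), None)
empty = (None, None)

for period in (full, half, empty):
    print(has_nat(period), has_only_nat(period))
# False False
# True False
# True True
```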

strftime(format='%Y-%m-%d %H:%M:%S')[source]

Convert the TimestampPeriod to a list of strings.

Formats the Timestamps as strings.

Parameters:

format (str, optional) – The Timestamp-format to use. The default is “%Y-%m-%d %H:%M:%S”.

Returns:

A list of the start and end of the TimestampPeriod as formatted strings.

Return type:

list of 2 strings
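
For illustration, the same result shape can be produced with plain datetime objects and the default format string. The example dates are arbitrary:

```python
from datetime import datetime

start, end = datetime(2021, 1, 1), datetime(2021, 12, 31, 23, 59)
fmt = "%Y-%m-%d %H:%M:%S"  # the documented default format

as_strings = [start.strftime(fmt), end.strftime(fmt)]
print(as_strings)  # ['2021-01-01 00:00:00', '2021-12-31 23:59:00']
```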

inside(other)[source]

Is the TimestampPeriod inside another TimestampPeriod?

Parameters:

other (TimestampPeriod or tuple of 2 Timestamp or Timestamp strings) – The other TimestampPeriod to test against. Test if this TimestampPeriod is inside the other.

Returns:

True if this TimestampPeriod is inside the other. This means that the start is greater than or equal to the other’s start and the end is smaller than the other’s end.

Return type:

bool

contains(other)[source]

Does this TimestampPeriod contain another TimestampPeriod?

Parameters:

other (TimestampPeriod or tuple of 2 Timestamp or Timestamp strings) – The other TimestampPeriod to test against. Test if this TimestampPeriod contains the other.

Returns:

True if this TimestampPeriod contains the other. This means that the start is smaller than or equal to the other’s start and the end is greater than the other’s end.

Return type:

bool
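
inside and contains are mirror images of each other, which a short sketch on (start, end) tuples makes visible. The function names are hypothetical, and both bounds are treated as inclusive here, which is an assumption: the descriptions only state "or equal" for the start.

```python
from datetime import datetime


def inside_sketch(period, other):
    # inclusive on both bounds (assumption, see the note above)
    return period[0] >= other[0] and period[1] <= other[1]


def contains_sketch(period, other):
    # "a contains b" is the same test as "b is inside a"
    return inside_sketch(other, period)


year = (datetime(2021, 1, 1), datetime(2021, 12, 31))
summer = (datetime(2021, 6, 1), datetime(2021, 8, 31))

print(inside_sketch(summer, year))    # True
print(inside_sketch(year, summer))    # False
print(contains_sketch(year, summer))  # True
```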

get_sql_format_dict(format="'%Y%m%d %H:%M'")[source]

Get the dictionary to use in sql queries.

Parameters:

format (str, optional) – The Timestamp-format to use. The default is “’%Y%m%d %H:%M’”.

Returns:

A dictionary with 2 keys (min_tstp, max_tstp) and the corresponding Timestamps as formatted strings.

Return type:

dict
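
A sketch of how such a dictionary could be built and used with str.format in a query. The table and column names are hypothetical and only serve the illustration; the keys and the default format are taken from the description above.

```python
from datetime import datetime

start, end = datetime(2021, 1, 1), datetime(2021, 1, 31, 23, 50)
fmt = "'%Y%m%d %H:%M'"  # the documented default, including the SQL quotes

sql_format_dict = {
    "min_tstp": start.strftime(fmt),
    "max_tstp": end.strftime(fmt),
}

# hypothetical table and column names, for illustration only
query = (
    "SELECT * FROM some_table "
    "WHERE timestamp BETWEEN {min_tstp} AND {max_tstp};"
).format(**sql_format_dict)
print(query)
# SELECT * FROM some_table WHERE timestamp BETWEEN '20210101 00:00' AND '20210131 23:50';
```

Because the quotes are part of the format string, the values can be placed into the query template without additional quoting.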

get_interval()[source]

Get the interval of the TimestampPeriod.

Returns:

The interval of this TimestampPeriod, e.g. Timedelta(2 days 12:30:12).

Return type:

pd.Timedelta

get_middle()[source]

Get the middle Timestamp of the TimestampPeriod.

Returns:

The middle Timestamp of this TimestampPeriod.

Return type:

Timestamp
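
The interval and the middle of a period follow from simple timestamp arithmetic. A sketch with standard-library datetime objects, assuming the middle is the start plus half the interval:

```python
from datetime import datetime, timedelta

start = datetime(2021, 1, 1)
end = datetime(2021, 1, 3, 12, 30, 12)

interval = end - start         # length of the period
middle = start + interval / 2  # assumed definition of the middle

print(interval)  # 2 days, 12:30:12
print(middle)    # 2021-01-02 06:15:06
```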

copy()[source]

Copy this TimestampPeriod.

Returns:

A new TimestampPeriod object that is equal to this one.

Return type:

TimestampPeriod

expand_to_timestamp()[source]

set_tz(tzinfo)[source]

Set the TimestampPeriod to a new timezone.

Parameters:

tzinfo (str or datetime.timezone object) – The timezone to set the TimestampPeriod to.

Returns:

This TimestampPeriod.

Return type:

TimestampPeriod

dwd

Some utility functions to get data from the DWD-CDC server.

Based on the max_fun package at https://github.com/maxschmi/max_fun. Created by Max Schmit, 2021.

weatherdb.utils.dwd.dwd_id_to_str(id)[source]

Convert a station id to normal DWD format as str.

Parameters:

id (int or str) – The id of the station.

Returns:

The station id as a string in normal DWD format.

Return type:

str
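
A possible reading of "normal DWD format", assuming it means a five-digit, zero-padded id as in DWD file names such as 10minutenwerte_TU_00044_akt.zip. The function below is a hypothetical re-implementation, not the library's code:

```python
def dwd_id_to_str_sketch(station_id):
    """Zero-pad a station id to five digits (assumed DWD convention)."""
    return str(int(station_id)).zfill(5)


print(dwd_id_to_str_sketch(44))      # 00044
print(dwd_id_to_str_sketch("1443"))  # 01443
```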

weatherdb.utils.dwd.get_ftp_file_list(ftp_conn, ftp_folders)[source]

Get a list of files in the folders with their modification dates.

Parameters:
  • ftp_conn (ftplib.FTP) – Ftp connection.

  • ftp_folders (list of str or pathlike object) – The directories on the ftp server to look for files.

Returns:

A list of Tuples. Every tuple stands for one file. The tuple consists of (filepath, modification date).

Return type:

list of tuples of strs
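
One way to collect such tuples is the MLSD command, which ftplib exposes as FTP.mlsd. The sketch below assumes the server supports MLSD and does not claim to match the library's implementation; list_files_sketch, FakeFTP and the folder name are hypothetical, and FakeFTP only exists so the example runs offline.

```python
import posixpath


def list_files_sketch(ftp_conn, ftp_folders):
    """Collect (filepath, modification date) tuples via MLSD."""
    files = []
    for folder in ftp_folders:
        folder = str(folder)
        for name, facts in ftp_conn.mlsd(folder, facts=["type", "modify"]):
            # skip directories and other non-file entries
            if facts.get("type") == "file":
                files.append((posixpath.join(folder, name), facts.get("modify")))
    return files


class FakeFTP:
    """Offline stand-in that mimics ftplib.FTP.mlsd for the demo below."""

    def mlsd(self, path="", facts=()):
        yield "recent", {"type": "dir", "modify": "20240101000000"}
        yield "data_00044.zip", {"type": "file", "modify": "20240102030405"}


file_list = list_files_sketch(FakeFTP(), ["/climate/daily"])
print(file_list)  # [('/climate/daily/data_00044.zip', '20240102030405')]
```

With a real connection, an ftplib.FTP instance would be passed in place of FakeFTP.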

weatherdb.utils.dwd.get_cdc_file_list(ftp_folders)[source]

weatherdb.utils.dwd.get_dwd_file(zip_filepath)[source]

Get a DataFrame from one single (zip-)file from the DWD FTP server.

Parameters:

zip_filepath (str) – Path to the file on the server, e.g.:
  • “/climate_environment/CDC/observations_germany/climate/10_minutes/air_temperature/recent/10minutenwerte_TU_00044_akt.zip”
  • “/climate_environment/CDC/derived_germany/soil/daily/historical/derived_germany_soil_daily_historical_73.txt.gz”

Returns:

The DataFrame of the selected file in the zip folder.

Return type:

pandas.DataFrame

weatherdb.utils.dwd.get_dwd_meta(ftp_folder)[source]

Get the meta file from the ftp_folder on the DWD server.

Downloads the meta file of a given folder and corrects it for missing files: if the folder contains no file for a station, that station’s meta entry is deleted. Resets “von_datum” in the meta file if there is a gap bigger than max_hole_d. Deletes entries with fewer years than min_years.

Parameters:

ftp_folder (str) – The path to the directory where to search for the meta file. e.g. “climate_environment/CDC/observations_germany/climate/hourly/precipitation/recent/”.

Returns:

A GeoDataFrame of the meta file.

Return type:

geopandas.GeoDataFrame

geometry

A collection of geometry functions.

Based on the max_fun package at https://github.com/maxschmi/max_fun. Created by Max Schmit, 2021.

weatherdb.utils.geometry.polar_line(center_xy, radius, angle)[source]

Create a LineString with polar coordinates.

Parameters:
  • center_xy (list, array or tuple of int or floats) – The X and Y coordinates of the center.

  • radius (int or float) – The radius of the circle.

  • angle (int) – The angle of the portion of the circle in degrees. 0 means east.

Returns:

LineString.

Return type:

shapely.geometry.LineString
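
The geometry behind such a line is a conversion from polar to Cartesian coordinates. The sketch below returns plain coordinate pairs instead of a LineString so that it needs no third-party package. It assumes that angles increase counterclockwise from east, which the description does not state, and the function name is hypothetical.

```python
import math


def polar_line_coords(center_xy, radius, angle):
    """Return the start and end coordinates of a line from the center outwards."""
    x, y = center_xy
    rad = math.radians(angle)
    # counterclockwise from east is an assumption
    end = (x + radius * math.cos(rad), y + radius * math.sin(rad))
    return [(x, y), end]


east = polar_line_coords((10, 20), 5, 0)
north = polar_line_coords((10, 20), 5, 90)
print(east)  # [(10, 20), (15.0, 20.0)]
```

The resulting coordinate list could be passed to shapely.geometry.LineString to obtain a geometry object.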

weatherdb.utils.geometry.raster2points(raster_np, transform, crs=None)[source]

Polygonize raster array to GeoDataFrame.

Currently, this only works for rasters with one band.

Parameters:
  • raster_np (np.array) – The imported raster array.

  • transform (rio.Affine) – The Affine transformation of the raster.

  • crs (str or crs-type, optional) – The coordinate reference system for the raster. The default is None.

Returns:

The raster data is in the data column.

Return type:

geopandas.GeoDataFrame
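
To show how an affine transformation maps raster cells to coordinates, here is a dependency-free sketch. It is heavily simplified and rests on assumptions: the raster is a list of rows, the transform is given as the six coefficients (a, b, c, d, e, f) in the order used by Affine, and each cell is represented by its center point. The function name is hypothetical, and the real function returns a GeoDataFrame.

```python
def raster_to_point_rows(raster, transform):
    """Return (x, y, value) for the center of every cell of a one-band raster.

    transform holds the affine coefficients (a, b, c, d, e, f) with
    x = a*col + b*row + c and y = d*col + e*row + f.
    """
    a, b, c, d, e, f = transform
    rows = []
    for row_idx, row in enumerate(raster):
        for col_idx, value in enumerate(row):
            col_c, row_c = col_idx + 0.5, row_idx + 0.5  # cell center
            x = a * col_c + b * row_c + c
            y = d * col_c + e * row_c + f
            rows.append((x, y, value))
    return rows


raster = [[1, 2], [3, 4]]
# cells of size 10 with the upper-left corner at (100, 200)
transform = (10.0, 0.0, 100.0, 0.0, -10.0, 200.0)

points = raster_to_point_rows(raster, transform)
print(points)
# [(105.0, 195.0, 1), (115.0, 195.0, 2), (105.0, 185.0, 3), (115.0, 185.0, 4)]
```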

get_data

logging

weatherdb.utils.logging.remove_old_logs(max_days=14)[source]

weatherdb.utils.logging.setup_logging_handlers()[source]

Set up the logging handlers depending on the configuration.

Raises:

ValueError – If the handler type is not known.
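
The general pattern of choosing a handler from a configured type and rejecting unknown types can be sketched with the standard logging module. The handler type names "console" and "null", the helper build_handler and the logger name are hypothetical and are not taken from the WeatherDB configuration.

```python
import logging
import sys


def build_handler(handler_type):
    """Hypothetical helper: map a configured handler type to a handler."""
    if handler_type == "console":
        return logging.StreamHandler(sys.stdout)
    if handler_type == "null":
        return logging.NullHandler()
    # unknown types are rejected, as in the documented behavior
    raise ValueError(f"Unknown logging handler type: {handler_type!r}")


logger = logging.getLogger("handler_sketch")
logger.addHandler(build_handler("console"))

try:
    build_handler("bogus")
except ValueError as err:
    print(err)  # Unknown logging handler type: 'bogus'
```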