dispaset.preprocessing package

Submodules

dispaset.preprocessing.data_check module

This file gathers the functions used in DispaSET to check the input data

__author__ = ‘Sylvain Quoilin (sylvain.quoilin@ec.europa.eu)’

dispaset.preprocessing.data_check.check_AvailabilityFactors(plants, AF)[source]

Function that checks the validity of the provided availability factors and warns if a default value of 100% is used.

dispaset.preprocessing.data_check.check_BSFlexMaxCapacity(parameters, config, sets)[source]

Function that checks if SectorXFlexMaxCapacity is at least as high as the average flexible demand input
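The comparison described above can be sketched in plain Python. This is an illustration only: the helper name `flex_capacity_shortfalls`, the sector labels, and the use of dictionaries instead of the `parameters`, `config` and `sets` arguments are all assumptions, not the DispaSET implementation.

```python
from statistics import mean

def flex_capacity_shortfalls(flex_demand, max_capacity):
    """Return the sectors whose maximum capacity is below their average flexible demand.

    Both arguments are hypothetical stand-ins: `flex_demand` maps a sector name
    to its demand time series, `max_capacity` maps a sector name to one value.
    """
    shortfalls = {}
    for sector, series in flex_demand.items():
        average = mean(series)
        capacity = max_capacity.get(sector, 0.0)
        if capacity < average:
            shortfalls[sector] = (capacity, average)
    return shortfalls

# Made-up example data
flex_demand = {"Z1_h2": [10.0, 20.0, 30.0], "Z2_h2": [5.0, 5.0, 5.0]}
max_capacity = {"Z1_h2": 25.0, "Z2_h2": 4.0}
shortfalls = flex_capacity_shortfalls(flex_demand, max_capacity)
print(shortfalls)
```

Only `Z2_h2` is reported here, because its capacity of 4.0 is below its average demand of 5.0.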

dispaset.preprocessing.data_check.check_BSFlexMaxSupply(parameters, config, sets)[source]

Function that checks if SectorXFlexMaxSupply is at least as high as the average flexible supply input

dispaset.preprocessing.data_check.check_CostXNotServed(config, CostXNotServed, zones_bs)[source]

Check that CostXNotServed is properly defined for all boundary sectors

Parameters:
  • config – Dictionary with all the configuration fields loaded from the excel file

  • CostXNotServed – DataFrame with CostXNotServed values

  • zones_bs – List of boundary sector zones

dispaset.preprocessing.data_check.check_FFRLimit(FFRLimit, Load)[source]

Function that checks the validity of the reserve requirement time series

Parameters:
  • FFRLimit – DataFrame of FFR Limit

  • Load – DataFrame of Loads

dispaset.preprocessing.data_check.check_FlexibleDemand(flex)[source]

Function that checks the validity of the provided flexible demand time series

dispaset.preprocessing.data_check.check_MinMaxFlows(df_min, df_max)[source]

Function that checks that there is no incompatibility between the minimum and maximum flows
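A minimal sketch of such a compatibility check, assuming that "incompatibility" means a minimum flow exceeding the maximum flow on the same line and time step. The helper `find_flow_conflicts` and the dictionary inputs are hypothetical; the real function receives two DataFrames.

```python
def find_flow_conflicts(flow_min, flow_max):
    """Return (line, step, minimum, maximum) for every step where minimum > maximum."""
    conflicts = []
    for line, minima in flow_min.items():
        maxima = flow_max.get(line)
        if maxima is None:
            continue  # no maximum defined for this line, nothing to compare
        for step, (low, high) in enumerate(zip(minima, maxima)):
            if low > high:
                conflicts.append((line, step, low, high))
    return conflicts

# Made-up example data: one time series per interconnection line
flow_min = {"BE -> NL": [0, 100, 50], "FR -> DE": [0, 0, 0]}
flow_max = {"BE -> NL": [500, 80, 500], "FR -> DE": [300, 300, 300]}
conflicts = find_flow_conflicts(flow_min, flow_max)
print(conflicts)
```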

dispaset.preprocessing.data_check.check_NonNaNKeys(plants, NonNaNKeys)[source]

Checks that the values of the given keys are not NaN

Parameters:
  • plants – plants dataframe

  • NonNaNKeys – list of NonNaN keys

dispaset.preprocessing.data_check.check_PrimaryReserveLimit(PrimaryReserveLimit, Load)[source]

Function that checks the validity of the reserve requirement time series

Parameters:
  • PrimaryReserveLimit – DataFrame of Primary Reserve Limit

  • Load – DataFrame of Loads

dispaset.preprocessing.data_check.check_StrKeys(plants, StrKeys)[source]

Checks that the values of the given keys are strings

Parameters:
  • plants – plants dataframe

  • StrKeys – list of Str keys

dispaset.preprocessing.data_check.check_boundary_sector(config, plants, BoundarySector=None)[source]

Check boundary sector units characteristics

Parameters:
  • config – Config dictionary

  • plants – DataFrame with plants data

  • BoundarySector – Optional DataFrame with boundary sector data for consistency check

dispaset.preprocessing.data_check.check_chp(config, plants)[source]

Function that checks the CHP plant characteristics

dispaset.preprocessing.data_check.check_clustering(plants, plants_merged)[source]

Function that checks that the installed capacities are still equal after the clustering process

Parameters:
  • plants – Non-clustered list of units

  • plants_merged – clustered list of units
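The idea behind this check can be sketched as a comparison of total installed capacity before and after clustering. The field names `PowerCapacity` and `Nunits`, the tolerance, and the helper names are assumptions made for this illustration; the real function works on the plants DataFrames.

```python
import math

def total_capacity(units):
    """Sum capacity over all units, counting clustered units Nunits times."""
    return sum(u["PowerCapacity"] * u.get("Nunits", 1) for u in units)

def capacities_match(plants, plants_merged, rel_tol=1e-9):
    """True if the installed capacity is unchanged by the clustering."""
    return math.isclose(
        total_capacity(plants), total_capacity(plants_merged), rel_tol=rel_tol
    )

# Made-up example: two identical CCGT units merged into one cluster
plants = [
    {"Unit": "CCGT_1", "PowerCapacity": 400.0, "Nunits": 1},
    {"Unit": "CCGT_2", "PowerCapacity": 400.0, "Nunits": 1},
    {"Unit": "Wind_1", "PowerCapacity": 150.0, "Nunits": 1},
]
plants_merged = [
    {"Unit": "CCGT_cluster", "PowerCapacity": 400.0, "Nunits": 2},
    {"Unit": "Wind_1", "PowerCapacity": 150.0, "Nunits": 1},
]
print(capacities_match(plants, plants_merged))
```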

dispaset.preprocessing.data_check.check_df(df, StartDate=None, StopDate=None, name='')[source]

Function that checks the time series provided as inputs

dispaset.preprocessing.data_check.check_grid_data(lines, PTDF, config)[source]

Checks the consistency between NTC data, PTDF matrix, and configured zones.

Ensures that all transmission lines in the NTC dataframe are present as rows in the PTDF matrix, and all configured zones are present as columns in the PTDF matrix.

Parameters:
  • lines – DataFrame containing Net Transfer Capacities.

  • PTDF – DataFrame containing the Power Transfer Distribution Factor matrix.

  • config – Dictionary containing the simulation configuration, including zones.

Returns:

True if checks pass, otherwise logs errors and exits.
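The two consistency conditions can be expressed as set differences. The sketch below uses plain lists of labels in place of the `lines` and `PTDF` DataFrames and the `config` dictionary; the helper name `grid_data_problems` and the example labels are hypothetical.

```python
def grid_data_problems(line_names, ptdf_rows, ptdf_columns, zones):
    """Return (lines missing from the PTDF rows, zones missing from the PTDF columns)."""
    missing_lines = sorted(set(line_names) - set(ptdf_rows))
    missing_zones = sorted(set(zones) - set(ptdf_columns))
    return missing_lines, missing_zones

# Made-up example data
line_names = ["BE -> NL", "BE -> FR"]
ptdf_rows = ["BE -> NL"]
ptdf_columns = ["BE", "NL"]
zones = ["BE", "NL", "FR"]

missing_lines, missing_zones = grid_data_problems(
    line_names, ptdf_rows, ptdf_columns, zones
)
print(missing_lines, missing_zones)
```

Both lists being empty corresponds to the case where the checks pass.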

dispaset.preprocessing.data_check.check_heat_demand(plants, data, zones_th)[source]

Function that checks the validity of the heat demand profiles

Parameters:
  • plants – List of plants

  • data – Dataframe with the heat demand time series

  • zones_th – List with the heating zones

dispaset.preprocessing.data_check.check_keys(plants, keys, unit)[source]

Checking mandatory keys

Parameters:
  • plants – plants dataframe

  • keys – list of keys

  • unit – string denoting type of units being checked

dispaset.preprocessing.data_check.check_p2bs(config, plants)[source]

Function that checks the p2bs unit characteristics

dispaset.preprocessing.data_check.check_reserves(Reserve2D, Reserve2U, Load)[source]

Function that checks the validity of the reserve requirement time series

Parameters:
  • Reserve2D – DataFrame of reserves 2D

  • Reserve2U – DataFrame of reserves 2U

  • Load – DataFrame of Loads

dispaset.preprocessing.data_check.check_simulation_environment(SimulationPath, store_type='pickle', firstline=7)[source]

Function to test the validity of Dispa-SET inputs

Parameters:
  • SimulationPath – Path to the simulation folder

  • store_type – Choose between “list”, “excel”, “pickle”

  • firstline – Number of the first line in the data (only if store_type is “excel”)

dispaset.preprocessing.data_check.check_sto(config, plants, raw_data=True)[source]

Function that checks the storage plant characteristics

dispaset.preprocessing.data_check.check_units(config, plants)[source]

Function that checks the power plant characteristics

dispaset.preprocessing.data_check.isStorage(tech)[source]

Function that returns True if the technology is a storage technology

dispaset.preprocessing.data_check.isVRE(tech)[source]

Function that returns True if the technology is a variable renewable energy technology

dispaset.preprocessing.data_handler module

dispaset.preprocessing.data_handler.GenericTable(headers, varname, config, default=None)[source]

This function loads the tabular data stored in csv files and assigns the proper values to each pre-specified column. If not found, the default value is assigned.

Parameters:
  • headers – List with the column headers to be read

  • varname – Variable to be read

  • config – Config variable

  • default – Default value to be applied if no data is found

Returns:

Dataframe with the time series for each unit

dispaset.preprocessing.data_handler.NodeBasedTable(varname, config, default=None)[source]

This function loads the tabular data stored in csv files relative to each zone of the simulation.

Parameters:
  • varname – Variable name (as defined in config)

  • config – Dispa-SET config data

  • default – Default value to be applied if no data is found

Returns:

Dataframe with the time series for each unit

dispaset.preprocessing.data_handler.UnitBasedTable(plants, varname, config, fallbacks=['Unit'], default=None, RestrictWarning=None)[source]

This function loads the tabular data stored in csv files and assigns the proper values to each unit of the plants dataframe. If the unit-specific value is not found in the data, the script can fall back on more generic data (e.g. fuel-based, technology-based, zone-based) or on the default value. The order in which the data should be loaded is specified in the fallbacks list. For example, [‘Unit’, ‘Technology’] means that the script will first try to find a perfect match for the unit name in the data table. If not found, a column with the unit technology as header is searched. If not found, the default value is assigned.

Parameters:
  • plants – Dataframe with the units for which data is required

  • varname – Variable name (as defined in config)

  • config – Dispa-SET config file

  • fallbacks – List with the order of data source.

  • default – Default value to be applied if no data is found

  • RestrictWarning – Only display the warnings if the unit belongs to the list of technologies provided in this parameter

Returns:

Dataframe with the time series for each unit
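The fallback order described for UnitBasedTable can be illustrated with a small stand-alone sketch. It uses dictionaries rather than DataFrames and csv files, and the helpers `pick_column` and `build_table`, the unit names and the technology codes are assumptions made for the example rather than DispaSET code.

```python
def pick_column(unit, available_columns, fallbacks=("Unit",)):
    """Return the first matching column header, following the fallback order."""
    for key in fallbacks:
        candidate = unit.get(key)
        if candidate in available_columns:
            return candidate
    return None  # nothing matched: the caller applies the default value

def build_table(units, data, fallbacks=("Unit",), default=None):
    """Assign one series (or the default) to every unit."""
    table = {}
    for unit in units:
        column = pick_column(unit, data, fallbacks)
        table[unit["Unit"]] = list(data[column]) if column is not None else default
    return table

# Made-up example data: column header -> time series
units = [
    {"Unit": "Wind_BE_1", "Technology": "WTON", "Zone": "BE"},
    {"Unit": "Solar_BE_1", "Technology": "PHOT", "Zone": "BE"},
    {"Unit": "Hydro_BE_1", "Technology": "HROR", "Zone": "BE"},
]
data = {
    "Wind_BE_1": [0.30, 0.45],  # unit-specific column
    "PHOT": [0.00, 0.60],       # technology-level column
}
table = build_table(units, data, fallbacks=("Unit", "Technology"), default=1.0)
print(table)
```

In this example the wind unit gets its own column, the solar unit falls back on the technology column, and the hydro unit receives the default value.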

dispaset.preprocessing.data_handler.define_parameter(sets_in, sets, value=0)[source]

Function to define a DispaSET parameter and fill it with a constant value

Parameters:
  • sets_in – List with the labels of the sets corresponding to the parameter

  • sets – dictionary containing the definition of all the sets (must comprise those referenced in sets_in)

  • value – Default value to attribute to the parameter
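As an illustration of what "fill it with a constant value" means, the sketch below derives the dimensions of a parameter from the sizes of the referenced sets and builds a nested list holding the constant. The helper `constant_parameter`, its returned dictionary layout and the set labels are assumptions; the actual return type of define_parameter is not documented here.

```python
def constant_parameter(sets_in, sets, value=0):
    """Build a constant-valued parameter with one dimension per label in sets_in."""
    shape = tuple(len(sets[name]) for name in sets_in)

    def fill(dims):
        # Recursively build independent nested lists, one level per dimension
        if not dims:
            return value
        return [fill(dims[1:]) for _ in range(dims[0])]

    return {"sets": list(sets_in), "shape": shape, "val": fill(shape)}

# Made-up sets: two units and three time steps
sets = {"u": ["CCGT_1", "Wind_1"], "h": [0, 1, 2]}
param = constant_parameter(["u", "h"], sets, value=1.0)
print(param["shape"], param["val"])
```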

dispaset.preprocessing.data_handler.export_yaml_config(ExcelFile, YAMLFile)[source]

Function that loads the DispaSET excel config file and dumps it as a yaml file.

Parameters:
  • ExcelFile – Path to the Excel config file

  • YAMLFile – Path to the YAML config file to be written

dispaset.preprocessing.data_handler.load_config(ConfigFile, AbsPath=True)[source]

Wrapper function around load_config_excel and load_config_yaml
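One plausible way for such a wrapper to choose between the two loaders is to look at the file extension. The sketch below only returns the name of the loader that would be called; the helper `pick_loader`, the accepted extensions and the example paths are assumptions for illustration, not a description of the actual dispatch logic.

```python
from pathlib import Path

def pick_loader(config_file):
    """Return the name of the loader matching the file extension (hypothetical)."""
    suffix = Path(config_file).suffix.lower()
    if suffix in (".xls", ".xlsx"):
        return "load_config_excel"
    if suffix in (".yml", ".yaml"):
        return "load_config_yaml"
    raise ValueError(f"Unsupported configuration file type: {suffix!r}")

print(pick_loader("ConfigFiles/ConfigTest.xlsx"))
print(pick_loader("ConfigFiles/ConfigTest.yml"))
```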

dispaset.preprocessing.data_handler.load_config_excel(ConfigFile, AbsPath=True)[source]

Function that loads the DispaSET excel config file and returns a dictionary with the values

Parameters:
  • ConfigFile – String with (relative) path to the DispaSET excel configuration file

  • AbsPath – If true, relative paths are automatically changed into absolute paths (recommended)

dispaset.preprocessing.data_handler.load_config_yaml(filename, AbsPath=True)[source]

Loads YAML file to dictionary

dispaset.preprocessing.data_handler.load_geo_data(path, header=None)[source]

Load geo data for individual zones.

Parameters:
  • path – absolute path to the geo data file

  • header – load header

dispaset.preprocessing.data_handler.load_time_series(config, path, header='infer')[source]

Function that loads time series data, checks the compatibility of the indexes and guesses when no exact match between the required index and the data is present

Parameters:
  • config – Dispa-SET config

  • path – Path to the desired time series

  • header – List of header names

Returns:

Reindexed time series

dispaset.preprocessing.data_handler.merge_series(plants, oldplants, data, method='WeightedAverage', tablename='')[source]

Function that merges the time series corresponding to the merged units (e.g. outages, inflows, etc.)

Parameters:
  • plants – Pandas dataframe with final units after clustering (must contain ‘FormerUnits’)

  • oldplants – Pandas dataframe with the original units

  • data – Pandas dataframe with the time series and the original unit names as column header

  • method – Select the merging method (‘WeightedAverage’/‘Sum’)

  • tablename – Name of the table being processed (e.g. ‘Outages’), used in the warnings

Returns:

Pandas dataframe with the merged time series when necessary
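The two merging methods can be sketched for a single clustered unit as follows. The helper `merge_columns` is hypothetical, lists stand in for DataFrame columns, and the choice of weights (for instance the capacity of each former unit) is an assumption of this example.

```python
def merge_columns(former_units, data, weights, method="WeightedAverage"):
    """Merge the series of the former units into one series for the cluster."""
    columns = [data[name] for name in former_units]
    if method == "Sum":
        return [sum(values) for values in zip(*columns)]
    if method == "WeightedAverage":
        total = sum(weights[name] for name in former_units)
        return [
            sum(weights[name] * value for name, value in zip(former_units, values))
            / total
            for values in zip(*columns)
        ]
    raise ValueError(f"Unknown merging method: {method}")

# Made-up example: availability series of two units merged into one cluster
data = {"CCGT_1": [0.0, 1.0], "CCGT_2": [1.0, 1.0]}
weights = {"CCGT_1": 300.0, "CCGT_2": 100.0}  # assumed weights, e.g. capacities
former = ["CCGT_1", "CCGT_2"]

averaged = merge_columns(former, data, weights)
summed = merge_columns(former, data, weights, method="Sum")
print(averaged, summed)
```

A weighted average suits intensive quantities such as availability or outage factors, while a sum suits extensive quantities such as inflows.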

dispaset.preprocessing.data_handler.read_Participation(sheet, rowstart, colstart, rowstop, colapart=1)[source]

Creates a dict for each technology, with 0 for false and 1 for true (the first value applies without CHP, the second with CHP)

Parameters:
  • sheet – Excel sheet to load data from

  • rowstart – Row to start reading the data

  • colstart – Column to start reading the data

  • rowstop – Row to stop reading the data

  • colapart – Columns apart to read the data

dispaset.preprocessing.data_handler.read_truefalse(sheet, rowstart, colstart, rowstop, colstop, colapart=1)[source]

Function that reads a two-column format with a list of strings in the first column and a list of true/false values in the second column. The list of strings associated with a True value is returned.
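The selection logic can be sketched on an in-memory list of rows. Reading the cells from the Excel sheet is left out, and the helper `names_flagged_true` and the example rows are hypothetical.

```python
def names_flagged_true(rows):
    """Return the strings from the first column whose second column is True."""
    return [name for name, flag in rows if flag is True]

# Made-up two-column content: (string, true/false)
rows = [("BE", True), ("NL", False), ("FR", True)]
selected = names_flagged_true(rows)
print(selected)
```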

dispaset.preprocessing.preprocessing module

dispaset.preprocessing.utils module

Module contents