DispaSET.preprocessing package

Submodules

DispaSET.preprocessing.data_check module

This file gathers the functions used in DispaSET to check the input data.

__author__ = ‘Sylvain Quoilin (sylvain.quoilin@ec.europa.eu)’

DispaSET.preprocessing.data_check.check_AvailabilityFactors(plants, AF)[source]

Function that checks the validity of the provided availability factors and warns if a default value of 100% is used.

DispaSET.preprocessing.data_check.check_MinMaxFlows(df_min, df_max)[source]

Function that checks that there is no incompatibility between the minimum and maximum flows
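
As a minimal sketch of this kind of check, assume the flows are stored as plain dictionaries that map a line name to a list of hourly values. The helper name find_flow_conflicts and this data layout are illustrative only and are not part of DispaSET, which works on dataframes.

```python
def find_flow_conflicts(flows_min, flows_max):
    """Return (line, timestep) pairs where the minimum flow exceeds the maximum."""
    conflicts = []
    for line, min_values in flows_min.items():
        max_values = flows_max.get(line)
        if max_values is None:
            # No upper bound provided for this line: nothing to compare.
            continue
        for step, (low, high) in enumerate(zip(min_values, max_values)):
            if low > high:
                conflicts.append((line, step))
    return conflicts


# Hypothetical example data (line names and values are made up).
flows_min = {"AT -> DE": [0, 50, 120], "DE -> FR": [0, 0, 0]}
flows_max = {"AT -> DE": [100, 100, 100], "DE -> FR": [200, 200, 200]}
conflicts = find_flow_conflicts(flows_min, flows_max)
```

Here only the third timestep of the first line is reported, because 120 exceeds the maximum of 100.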

DispaSET.preprocessing.data_check.check_chp(config, plants)[source]

Function that checks the CHP plant characteristics

DispaSET.preprocessing.data_check.check_clustering(plants, plants_merged)[source]

Function that checks that the installed capacities are still equal after the clustering process

Parameters:
  • plants – Non-clustered list of units
  • plants_merged – clustered list of units
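
The idea behind this check can be sketched as follows, assuming each unit is a plain dictionary with a "Technology" and a "PowerCapacity" entry. The helper name capacities_match, the tolerance and the dictionary layout are assumptions made for illustration; the actual function operates on the DispaSET plants dataframes.

```python
import math


def capacities_match(plants, plants_merged, rel_tol=1e-6):
    """Compare the total capacity per technology before and after clustering."""

    def totals(units):
        result = {}
        for unit in units:
            tech = unit["Technology"]
            result[tech] = result.get(tech, 0.0) + unit["PowerCapacity"]
        return result

    before, after = totals(plants), totals(plants_merged)
    if before.keys() != after.keys():
        return False
    return all(math.isclose(before[t], after[t], rel_tol=rel_tol) for t in before)


# Hypothetical units: two plants merged into one.
plants = [
    {"Technology": "COMC", "PowerCapacity": 100.0},
    {"Technology": "COMC", "PowerCapacity": 50.0},
]
plants_merged = [{"Technology": "COMC", "PowerCapacity": 150.0}]
ok = capacities_match(plants, plants_merged)
```
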
DispaSET.preprocessing.data_check.check_df(df, StartDate=None, StopDate=None, name='')[source]

Function that checks the time series provided as inputs

DispaSET.preprocessing.data_check.check_heat_demand(plants, data)[source]

Function that checks the validity of the heat demand profiles

Parameters:
  • plants – List of CHP plants
DispaSET.preprocessing.data_check.check_simulation_environment(SimulationPath, store_type='pickle', firstline=7)[source]

Function that tests the validity of the DispaSET inputs

Parameters:
  • SimulationPath – Path to the simulation folder
  • store_type – Choose between “list”, “excel” and “pickle”
  • firstline – Number of the first line in the data (only if store_type is ‘excel’)

DispaSET.preprocessing.data_check.check_sto(config, plants, raw_data=True)[source]

Function that checks the storage plant characteristics

DispaSET.preprocessing.data_check.check_units(config, plants)[source]

Function that checks the power plant characteristics

DispaSET.preprocessing.data_check.isStorage(tech)[source]

Function that returns True if the technology is a storage technology

DispaSET.preprocessing.data_check.isVRE(tech)[source]

Function that returns True if the technology is a variable renewable energy technology
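
Both isStorage and isVRE are simple membership tests on a technology code. The sketch below illustrates the pattern; the two sets of codes are assumptions chosen for the example and should be checked against the DispaSET source before being relied upon.

```python
# Assumed technology codes, for illustration only.
STORAGE_TECHS = {"HDAM", "HPHS", "BATS", "BEVS", "CAES", "THMS"}
VRE_TECHS = {"HROR", "PHOT", "WTON", "WTOF"}


def is_storage(tech):
    """Return True if the technology code belongs to the storage set."""
    return tech in STORAGE_TECHS


def is_vre(tech):
    """Return True if the code belongs to the variable renewable set."""
    return tech in VRE_TECHS
```
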

DispaSET.preprocessing.data_handler module

DispaSET.preprocessing.data_handler.NodeBasedTable(path, idx, countries, tablename='', default=None)[source]

This function loads the tabular data stored in csv files relative to each zone (a.k.a node, country) of the simulation.

Parameters:
  • path – Path to the data to be loaded
  • idx – Pandas datetime index to be used for the output
  • countries – List with the country codes to be considered
  • fallback – List with the order of data source.
  • tablename – String with the name of the table being processed
  • default – Default value to be applied if no data is found
Returns:

Dataframe with the time series for each unit

DispaSET.preprocessing.data_handler.UnitBasedTable(plants, path, idx, countries, fallbacks=['Unit'], tablename='', default=None, RestrictWarning=None)[source]

This function loads the tabular data stored in csv files and assigns the proper values to each unit of the plants dataframe. If the unit-specific value is not found in the data, the script can fall back on more generic data (e.g. fuel-based, technology-based, zone-based) or on the default value. The order in which the data should be loaded is specified in the fallbacks list. For example, [‘Unit’, ‘Technology’] means that the script first tries to find a perfect match for the unit name in the data table. If none is found, a column with the unit technology as header is searched for. If that is not found either, the default value is assigned.

Parameters:
  • plants – Dataframe with the units for which data is required
  • path – Path to the data to be loaded
  • idx – Pandas datetime index to be used for the output
  • countries – List with the country codes to be considered
  • fallbacks – List with the order of the data sources
  • tablename – String with the name of the table being processed
  • default – Default value to be applied if no data is found
  • RestrictWarning – Only display the warnings if the unit belongs to the list of technologies provided in this parameter
Returns:

Dataframe with the time series for each unit
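
The fallback order described for UnitBasedTable can be sketched as follows, assuming the data table is a dictionary of column header to time series and each unit is a dictionary of attributes. The helper name pick_series and these data structures are hypothetical; the real function works on pandas dataframes read from csv files.

```python
def pick_series(unit, table, fallbacks=("Unit",), default=None):
    """Return the first column matching the unit, following the fallback order."""
    for key in fallbacks:
        header = unit.get(key)
        if header in table:
            return table[header]
    # Nothing matched: use the default value.
    return default


# Hypothetical data table and units.
table = {"Plant_A": [1.0, 1.0, 0.5], "PHOT": [0.0, 0.4, 0.7]}
unit_a = {"Unit": "Plant_A", "Technology": "COMC"}
unit_b = {"Unit": "Plant_B", "Technology": "PHOT"}
unit_c = {"Unit": "Plant_C", "Technology": "STUR"}

order = ["Unit", "Technology"]
series_a = pick_series(unit_a, table, order, default=1)  # exact unit match
series_b = pick_series(unit_b, table, order, default=1)  # technology match
series_c = pick_series(unit_c, table, order, default=1)  # default value
```
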

DispaSET.preprocessing.data_handler.define_parameter(sets_in, sets, value=0)[source]

Function to define a DispaSET parameter and fill it with a constant value

Parameters:
  • sets_in – List with the labels of the sets corresponding to the parameter
  • sets – dictionary containing the definition of all the sets (must comprise those referenced in sets_in)
  • value – Default value to attribute to the parameter
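
A rough sketch of what such a parameter definition can look like, using nested lists instead of arrays so that it runs without third-party packages. The function name define_parameter_sketch and the returned layout (a dictionary with "sets" and "val" keys) are assumptions made for this example.

```python
def define_parameter_sketch(sets_in, sets, value=0):
    """Build a nested list filled with `value`, one dimension per set label."""
    missing = [label for label in sets_in if label not in sets]
    if missing:
        raise KeyError(f"Sets not defined: {missing}")

    def build(dims):
        # Recursively create one nesting level per remaining dimension.
        if not dims:
            return value
        return [build(dims[1:]) for _ in range(dims[0])]

    shape = [len(sets[label]) for label in sets_in]
    return {"sets": list(sets_in), "val": build(shape)}


# Hypothetical sets: two units and three time steps.
sets = {"u": ["Plant_A", "Plant_B"], "h": ["1", "2", "3"]}
param = define_parameter_sketch(["u", "h"], sets, value=1)
```
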
DispaSET.preprocessing.data_handler.invert_dic_df(dic, tablename='')[source]

Function that takes as input a dictionary of dataframes, and inverts the key of the dictionary with the columns headers of the dataframes

Parameters:
  • dic – dictionary of dataframes, with the same columns headers and the same index
  • tablename – string with the name of the table being processed (for the error msg)
Returns:

dictionary of dataframes, with swapped headers
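
The inversion can be illustrated with nested dictionaries standing in for the dataframes. The helper name invert_nested and the sample data are hypothetical.

```python
def invert_nested(dic):
    """Swap the outer keys with the inner (column) keys of a dict of dicts."""
    inverted = {}
    for outer, inner in dic.items():
        for column, values in inner.items():
            inverted.setdefault(column, {})[outer] = values
    return inverted


# Hypothetical input: one table per year, one column per zone.
flows = {
    "2015": {"AT": [1, 2], "DE": [3, 4]},
    "2016": {"AT": [5, 6], "DE": [7, 8]},
}
by_zone = invert_nested(flows)
```

After the call, by_zone is keyed by zone first and by year second.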

DispaSET.preprocessing.data_handler.load_config_excel(ConfigFile)[source]

Function that loads the DispaSET excel config file and returns a dictionary with the values

Parameters:ConfigFile – String with (relative) path to the DispaSET excel configuration file
DispaSET.preprocessing.data_handler.load_config_yaml(filename)[source]

Loads YAML file to dictionary

DispaSET.preprocessing.data_handler.load_csv(filename, TempPath='.pickle', header=0, skiprows=None, skipfooter=0, index_col=None, parse_dates=False)[source]

Function that loads a csv file into a dataframe and saves a temporary pickle version of it. If the pickle is newer than the csv file, the file is not loaded again.

Parameters:
  • filename – path to the csv file
  • TempPath – path to store the temporary data files
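
The caching behaviour can be sketched with the standard library alone. This is not the DispaSET implementation: the helper name load_csv_cached, the cache file naming and the use of a list of rows instead of a dataframe are assumptions made to keep the example self-contained.

```python
import csv
import os
import pickle


def load_csv_cached(filename, temp_path=".pickle"):
    """Load a csv file as a list of rows, reusing a newer pickled copy if any."""
    os.makedirs(temp_path, exist_ok=True)
    cache = os.path.join(temp_path, os.path.basename(filename) + ".p")

    # Reuse the pickle only if it is strictly newer than the source file.
    if os.path.exists(cache) and os.path.getmtime(cache) > os.path.getmtime(filename):
        with open(cache, "rb") as handle:
            return pickle.load(handle)

    with open(filename, newline="") as handle:
        rows = list(csv.reader(handle))
    with open(cache, "wb") as handle:
        pickle.dump(rows, handle)
    return rows
```
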
DispaSET.preprocessing.data_handler.merge_series(plants, data, mapping, method='WeightedAverage', tablename='')[source]

Function that merges the times series corresponding to the merged units (e.g. outages, inflows, etc.)

Parameters:
  • plants – Pandas dataframe with the information relative to the original units
  • data – Pandas dataframe with the time series and the original unit names as column header
  • mapping – Mapping between the merged units and the original units. Output of the clustering function
  • method – Select the merging method (‘WeightedAverage’/’Sum’)
  • tablename – Name of the table being processed (e.g. ‘Outages’), used in the warnings
Returns:

Pandas dataframe with the merged time series when necessary
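
The two merging methods can be illustrated as follows, assuming that the weighted average uses the installed capacity of each original unit as weight. That weighting, the helper name merge_weighted and the dictionary-based data layout are assumptions for this sketch.

```python
def merge_weighted(data, capacities, mapping, method="WeightedAverage"):
    """Merge the time series of the original units into the clustered units."""
    merged = {}
    for new_unit, old_units in mapping.items():
        n_steps = len(data[old_units[0]])
        total_capacity = sum(capacities[u] for u in old_units)
        series = []
        for t in range(n_steps):
            if method == "Sum":
                series.append(sum(data[u][t] for u in old_units))
            elif method == "WeightedAverage":
                weighted = sum(data[u][t] * capacities[u] for u in old_units)
                series.append(weighted / total_capacity)
            else:
                raise ValueError(f"Unknown method: {method}")
        merged[new_unit] = series
    return merged


# Hypothetical outage factors for two units merged into one.
data = {"A": [0.0, 1.0], "B": [1.0, 1.0]}
capacities = {"A": 100.0, "B": 300.0}
mapping = {"AB": ["A", "B"]}
averaged = merge_weighted(data, capacities, mapping)
summed = merge_weighted(data, capacities, mapping, method="Sum")
```
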

DispaSET.preprocessing.data_handler.write_to_excel(xls_out, list_vars)[source]

Function that reads all the variables (in list_vars) and inserts them one by one to excel

Parameters:
  • xls_out – The path of the folder where the excel files are to be written
  • list_vars – List containing the dispaset variables
Returns:

Binary variable (True)

DispaSET.preprocessing.preprocessing module

This is the main file of the DispaSET pre-processing tool. It comprises a single function that generates the DispaSET simulation environment.

@author: S. Quoilin

DispaSET.preprocessing.preprocessing.adjust_capacity(inputs, tech_fuel, scaling=1, value=None, singleunit=False, write_gdx=False, dest_path='')[source]

Function used to modify the installed capacities in the Dispa-SET generated input data. The function updates the Inputs.p file in the simulation directory at each call.

Parameters:
  • inputs – Input data dictionary OR path to the simulation directory containing Inputs.p
  • tech_fuel – tuple with the technology and fuel type for which the capacity should be modified
  • scaling – Scaling factor to be applied to the installed capacity
  • value – Absolute value of the desired capacity (! Applied only if scaling != 1 !)
  • singleunit – Set to true if the technology should remain lumped in a single unit
  • write_gdx – boolean defining if Inputs.gdx should be also overwritten with the new data
  • dest_path – Simulation environment path to write the new input data. If unspecified, no data is written!
Returns:

New SimData dictionary

DispaSET.preprocessing.preprocessing.adjust_storage(inputs, tech_fuel, scaling=1, value=None, write_gdx=False, dest_path='')[source]

Function used to modify the storage capacities in the Dispa-SET generated input data. The function updates the Inputs.p file in the simulation directory at each call.

Parameters:
  • inputs – Input data dictionary OR path to the simulation directory containing Inputs.p
  • tech_fuel – tuple with the technology and fuel type for which the capacity should be modified
  • scaling – Scaling factor to be applied to the installed capacity
  • value – Absolute value of the desired capacity (! Applied only if scaling != 1 !)
  • write_gdx – boolean defining if Inputs.gdx should be also overwritten with the new data
  • dest_path – Simulation environment path to write the new input data. If unspecified, no data is written!
Returns:

New SimData dictionary

DispaSET.preprocessing.preprocessing.build_simulation(config)[source]

This function reads the DispaSET config, loads the specified data, processes it when needed, and formats it in the proper DispaSET format. The output of the function is a directory with all the inputs and simulation files required to run a DispaSET simulation.

Parameters:
  • config – Dictionary with all the configuration fields loaded from the excel file. Output of the ‘LoadConfig’ function.
  • plot_load – Boolean used to display a plot of the demand curves in the different zones
DispaSET.preprocessing.preprocessing.get_git_revision_tag()[source]

Gets the version of DispaSET used for this run (tag + commit hash)

DispaSET.preprocessing.utils module

This file gathers different functions used in the DispaSET pre-processing tools

@author: Sylvain Quoilin (sylvain.quoilin@ec.europa.eu)

DispaSET.preprocessing.utils.clustering(plants, method='Standard', Nslices=20, PartLoadMax=0.1, Pmax=30)[source]

Merge excessively disaggregated power Units.

Parameters:
  • plants – Pandas dataframe with each power plant and their characteristics (following the DispaSET format)
  • method – Select clustering method (‘Standard’/’LP’/None)
  • Nslices – Number of slices used to fingerprint the characteristics of each power plant in order to categorize them (with fewer slices, the plants are aggregated more easily)
  • PartLoadMax – Maximum part-load capability for the unit to be clustered
  • Pmax – Maximum power for the unit to be clustered
Returns:

A list with the merged plants and the mapping between the original and merged units

DispaSET.preprocessing.utils.incidence_matrix(sets, set_used, parameters, param_used)[source]

This function generates the incidence matrix of the lines within the nodes. A particular case is considered for the node “Rest of the World”, which is not explicitly defined in DispaSET.
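
A small sketch of an incidence matrix with the special handling of the rest of the world. It assumes that lines are named "origin -> destination", that the origin gets -1 and the destination +1, and that a node absent from the node list (such as RoW) simply gets no entry. The naming scheme, the sign convention and the helper name are assumptions for this example.

```python
def incidence_matrix_sketch(nodes, lines, sep=" -> "):
    """Return one row per line and one column per node (-1 origin, +1 destination)."""
    matrix = [[0] * len(nodes) for _ in lines]
    for i, line in enumerate(lines):
        origin, destination = line.split(sep)
        # Nodes that are not explicitly defined (e.g. RoW) are skipped.
        if origin in nodes:
            matrix[i][nodes.index(origin)] = -1
        if destination in nodes:
            matrix[i][nodes.index(destination)] = 1
    return matrix


# Hypothetical nodes and lines.
nodes = ["AT", "DE"]
lines = ["AT -> DE", "DE -> RoW"]
matrix = incidence_matrix_sketch(nodes, lines)
```
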

DispaSET.preprocessing.utils.interconnections(Simulation_list, NTC_inter, Historical_flows)[source]

Function that checks for the possible interconnections of the countries included in the simulation. If an interconnection occurs between two of the countries selected by the user for the simulation, it extracts the NTC between those two countries. If an interconnection occurs between one of the selected countries and a country outside the simulation, it extracts the physical flows; it does so for each pair (country inside, country outside) and sums them together to create the interconnection of this country with the RoW.

Parameters:
  • Simulation_list – List of simulated countries
  • NTC_inter – Day-ahead net transfer capacities (pd dataframe)
  • Historical_flows – Historical flows (pd dataframe)
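
The selection logic described for interconnections can be sketched with dictionaries in place of dataframes. The line naming ("origin -> destination"), the helper name split_interconnections and the returned pair of dictionaries are assumptions made for illustration.

```python
def split_interconnections(simulated, ntc, historical_flows, sep=" -> "):
    """Return the NTCs between simulated zones and the summed flows with the RoW."""
    ntc_inside = {}
    for line, series in ntc.items():
        origin, destination = line.split(sep)
        if origin in simulated and destination in simulated:
            ntc_inside[line] = series

    row_flows = {}
    for line, series in historical_flows.items():
        origin, destination = line.split(sep)
        origin_in, destination_in = origin in simulated, destination in simulated
        if origin_in == destination_in:
            # Both inside or both outside: not an exchange with the RoW.
            continue
        key = f"{origin}{sep}RoW" if origin_in else f"RoW{sep}{destination}"
        previous = row_flows.get(key, [0] * len(series))
        row_flows[key] = [a + b for a, b in zip(previous, series)]
    return ntc_inside, row_flows


# Hypothetical inputs: AT and DE are simulated, FR and PL are not.
simulated = ["AT", "DE"]
ntc = {"AT -> DE": [100, 100], "DE -> FR": [200, 200]}
flows = {
    "DE -> FR": [10, 20],
    "DE -> PL": [5, 5],
    "FR -> DE": [1, 2],
    "FR -> PL": [9, 9],
}
ntc_inside, row_flows = split_interconnections(simulated, ntc, flows)
```
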

Module contents