Data analysis
This module offers tools to access, arrange, analyse, and store output data from simulations.
A DataDict can be generated by the methods Model.run(), Experiment.run(), and DataDict.load().
- class DataDict(*args, **kwargs)
Nested dictionary for output data of simulations. Items can be accessed like attributes. Attributes can differ from the standard ones listed below.
- Variables
info (dict) – Metadata of the simulation.
parameters (DataDict) – Simulation parameters.
variables (DataDict) – Recorded variables, separated per object type.
reporters (pandas.DataFrame) – Reported outcomes of the simulation.
sensitivity (DataDict) – Sensitivity data, if calculated.
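The attribute-style access described above can be illustrated with a small stand-in class. This is a minimal sketch and not the library's implementation; the class name AttrDict and the example keys and values are hypothetical.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose items can also be used as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() and items() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store new attributes as ordinary dictionary items.
        self[name] = value


# Nested structure loosely following the categories listed above (made-up values).
results = AttrDict(
    info={"model_type": "MyModel"},
    parameters=AttrDict(agents=10),
)
results.custom = [1, 2, 3]  # a non-standard attribute

print(results.info["model_type"])
print(results.parameters.agents)
print(results["custom"])
```

Item access and attribute access refer to the same entry, which is why non-standard attributes can sit alongside the ones listed above.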
Data arrangement
- DataDict.arrange(variables=False, reporters=False, parameters=False, constants=False, obj_types=True, index=False)
Combines and/or filters data based on passed arguments.
- Parameters
variables (bool or str or list of str, optional) – Key or list of keys of variables to include in the dataframe. If True, all available variables are selected. If False (default), no variables are selected.
reporters (bool or str or list of str, optional) – Key or list of keys of reporters to include in the dataframe. If True, all available reporters are selected. If False (default), no reporters are selected.
parameters (bool or str or list of str, optional) – Key or list of keys of parameters to include in the dataframe. If True, all non-constant parameters are selected. If False (default), no parameters are selected.
constants (bool, optional) – Include constants if ‘parameters’ is True (default False).
obj_types (bool or str or list of str, optional) – Agent and/or environment types to include in the dataframe. If True (default), all objects are selected. If False, no objects are selected.
index (bool, optional) – Whether to keep original multi-index structure (default False).
- Returns
The newly arranged dataframe.
- Return type
pandas.DataFrame
- DataDict.arrange_reporters()
Common use case of DataDict.arrange with reporters=True and parameters=True.
- DataDict.arrange_variables()
Common use case of DataDict.arrange with variables=True and parameters=True.
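The "bool or str or list of str" convention used by the selection arguments of DataDict.arrange can be sketched as follows. The helper select_keys and the example key names are hypothetical and only illustrate how True, False, a single key, and a list of keys might be resolved; the library may handle these cases differently.

```python
def select_keys(available, selection):
    """Hypothetical helper: resolve a bool / str / list-of-str selection."""
    if selection is True:
        return list(available)      # True selects every available key
    if selection is False or selection is None:
        return []                   # False selects nothing
    if isinstance(selection, str):
        selection = [selection]     # a single key becomes a one-item list
    missing = [key for key in selection if key not in available]
    if missing:
        raise KeyError(f"Unknown keys: {missing}")
    return list(selection)


reporter_keys = ["seed", "gini", "wealth"]  # made-up reporter names

print(select_keys(reporter_keys, True))
print(select_keys(reporter_keys, False))
print(select_keys(reporter_keys, "gini"))
print(select_keys(reporter_keys, ["gini", "wealth"]))
```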
Analysis methods
- DataDict.calc_sobol(reporters=None, **kwargs)
Calculates Sobol Sensitivity Indices using SALib.analyze.sobol.analyze(). Data must be from an Experiment with a Sample that was generated with the method ‘saltelli’. If the experiment had more than one iteration, the mean value between iterations will be taken.
- Parameters
reporters (str or list of str, optional) – The reporters that should be used for the analysis. If none are passed, all existing reporters except ‘seed’ are used.
**kwargs – Will be forwarded to SALib.analyze.sobol.analyze().
- Returns
The DataDict itself with an added category ‘sensitivity’.
- Return type
DataDict
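The averaging step mentioned for calc_sobol (taking the mean between iterations before the analysis) can be sketched with the standard library alone. The record layout, the function name mean_over_iterations, and the values are assumptions made for illustration; this does not call SALib and is not the library's code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (sample_id, iteration, reporter value)
records = [
    (0, 0, 1.0),
    (0, 1, 3.0),
    (1, 0, 10.0),
    (1, 1, 20.0),
]


def mean_over_iterations(rows):
    """Collapse repeated iterations into one mean value per sample point."""
    grouped = defaultdict(list)
    for sample_id, _iteration, value in rows:
        grouped[sample_id].append(value)
    return {sample_id: mean(values) for sample_id, values in sorted(grouped.items())}


averaged = mean_over_iterations(records)
print(averaged)
```

After this step there is exactly one value per sample point, which is the shape a sensitivity analysis over the sample would expect.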
Save and load
- DataDict.save(exp_name=None, exp_id=None, path='ap_output', display=True)
Writes data to directory {path}/{exp_name}_{exp_id}/.
Works only for entries that are of type DataDict, pandas.DataFrame, or serializable with JSON (int, float, str, dict, list). Numpy objects will be converted to standard objects, if possible.
- Parameters
exp_name (str, optional) – Name of the experiment to be saved. If none is passed, self.info[‘model_type’] is used.
exp_id (int, optional) – Number of the experiment. Note that passing an existing id can overwrite existing data. If none is passed, a new id is generated.
path (str, optional) – Target directory (default ‘ap_output’).
display (bool, optional) – Display saving progress (default True).
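The directory pattern {path}/{exp_name}_{exp_id}/ and the JSON restriction can be sketched as below. The helpers experiment_dir and is_json_serializable, the file name info.json, and the example values are hypothetical; the files actually written by DataDict.save may be named and organised differently.

```python
import json
import tempfile
from pathlib import Path


def experiment_dir(path, exp_name, exp_id):
    """Hypothetical helper: build {path}/{exp_name}_{exp_id} as a Path."""
    return Path(path) / f"{exp_name}_{exp_id}"


def is_json_serializable(value):
    """Return True if the json module can encode the value as-is."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    target = experiment_dir(tmp, "MyModel", 1)
    target.mkdir(parents=True)

    info = {"model_type": "MyModel", "steps": 10}  # made-up metadata
    if is_json_serializable(info):
        (target / "info.json").write_text(json.dumps(info))

    saved = json.loads((target / "info.json").read_text())
    dir_name = target.name

print(dir_name, saved)
```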
- classmethod DataDict.load(exp_name=None, exp_id=None, path='ap_output', display=True)
Reads data from directory {path}/{exp_name}_{exp_id}/.
- Parameters
exp_name (str, optional) – Experiment name. If none is passed, the most recent experiment is chosen.
exp_id (int, optional) – Id number of the experiment. If none is passed, the highest available id is used.
path (str, optional) – Target directory (default ‘ap_output’).
display (bool, optional) – Display loading progress (default True).
- Returns
The loaded data from the chosen experiment.
- Return type
DataDict
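Choosing the highest available id when exp_id is omitted can be sketched as a scan over directories that follow the {exp_name}_{exp_id} pattern. The helper highest_exp_id and the directory names are hypothetical, and the library's own lookup may differ in detail.

```python
import re
import tempfile
from pathlib import Path


def highest_exp_id(path, exp_name):
    """Hypothetical helper: largest id among directories named {exp_name}_{id}."""
    pattern = re.compile(rf"^{re.escape(exp_name)}_(\d+)$")
    ids = []
    for entry in Path(path).iterdir():
        match = pattern.match(entry.name)
        if entry.is_dir() and match:
            ids.append(int(match.group(1)))  # compare ids as numbers, not text
    return max(ids) if ids else None


with tempfile.TemporaryDirectory() as tmp:
    for exp_id in (1, 2, 10):
        (Path(tmp) / f"MyModel_{exp_id}").mkdir()
    (Path(tmp) / "Other_99").mkdir()  # a different experiment, ignored below

    latest = highest_exp_id(tmp, "MyModel")
    missing = highest_exp_id(tmp, "Unknown")

print(latest, missing)
```

Converting the id to an integer matters here: compared as text, "2" would sort after "10".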