Data analysis

This module offers tools to access, arrange, analyse, and store output data from simulations. A DataDict can be generated by the methods Model.run(), Experiment.run(), and DataDict.load().

class DataDict(*args, **kwargs)[source]

Nested dictionary for output data of simulations. Items can be accessed like attributes. Attributes can differ from the standard ones listed below.

Variables
  • info (dict) – Metadata of the simulation.

  • parameters (DataDict) – Simulation parameters.

  • variables (DataDict) – Recorded variables, separated per object type.

  • reporters (pandas.DataFrame) – Reported outcomes of the simulation.

  • sensitivity (DataDict) – Sensitivity data, if calculated.
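To illustrate what "items can be accessed like attributes" means for a nested dictionary, here is a minimal sketch. The class `AttrDict` and the example values are hypothetical stand-ins written for this illustration; they are not the library's implementation of DataDict.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose items can be read like attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so regular dict methods such as .keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores a regular dictionary item.
        self[name] = value


# Made-up example content, mirroring the 'info' and 'parameters' entries above.
results = AttrDict(
    info={"model_type": "MyModel"},
    parameters=AttrDict(steps=10),  # nested dictionary
)

model_type = results.info["model_type"]  # same as results["info"]["model_type"]
steps = results.parameters.steps         # nested attribute-style access
results.note = "stored as an item"       # same as results["note"] = ...
```

Attribute access and key access reach the same underlying items, so either style can be used on the same object.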

Data arrangement

DataDict.arrange(variables=False, reporters=False, parameters=False, constants=False, obj_types=True, index=False)[source]

Combines and/or filters data based on passed arguments.

Parameters
  • variables (bool or str or list of str, optional) – Key or list of keys of variables to include in the dataframe. If True, all available variables are selected. If False (default), no variables are selected.

  • reporters (bool or str or list of str, optional) – Key or list of keys of reporters to include in the dataframe. If True, all available reporters are selected. If False (default), no reporters are selected.

  • parameters (bool or str or list of str, optional) – Key or list of keys of parameters to include in the dataframe. If True, all non-constant parameters are selected. If False (default), no parameters are selected.

  • constants (bool, optional) – Include constants if ‘parameters’ is True (default False).

  • obj_types (bool or str or list of str, optional) – Agent and/or environment types to include in the dataframe. If True (default), all objects are selected. If False, no objects are selected.

  • index (bool, optional) – Whether to keep original multi-index structure (default False).

Returns

The newly arranged dataframe.

Return type

pandas.DataFrame
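The selector arguments described above (variables, reporters, parameters, obj_types) all follow the same pattern: True selects everything, False selects nothing, and a string or list of strings selects specific keys. The following sketch shows one way such a selector could be resolved. The helper `select_keys` is hypothetical, and raising a KeyError for unknown keys is an assumption of this sketch, not documented behaviour.

```python
def select_keys(selector, available):
    """Resolve a bool / str / list-of-str selector against the available keys."""
    if selector is True:
        return list(available)   # True: select every available key
    if selector is False:
        return []                # False: select nothing
    if isinstance(selector, str):
        selector = [selector]    # a single key is treated as a one-item list
    missing = [key for key in selector if key not in available]
    if missing:
        # Assumption of this sketch: unknown keys are reported as an error.
        raise KeyError(f"unknown keys: {missing}")
    return list(selector)


# Made-up reporter names for illustration.
available_reporters = ["seed", "gini", "wealth"]

everything = select_keys(True, available_reporters)
nothing = select_keys(False, available_reporters)
single = select_keys("gini", available_reporters)
several = select_keys(["gini", "wealth"], available_reporters)
```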

DataDict.arrange_reporters()[source]

Common use case of DataDict.arrange with reporters=True and parameters=True.

DataDict.arrange_variables()[source]

Common use case of DataDict.arrange with variables=True and parameters=True.

Analysis methods

DataDict.calc_sobol(reporters=None, **kwargs)[source]

Calculates Sobol sensitivity indices using SALib.analyze.sobol.analyze(). Data must come from an Experiment with a Sample that was generated with the method ‘saltelli’. If the experiment had more than one iteration, the mean value across iterations is used.

Parameters
  • reporters (str or list of str, optional) – The reporters that should be used for the analysis. If none are passed, all existing reporters except ‘seed’ are used.

  • **kwargs – Will be forwarded to SALib.analyze.sobol.analyze().

Returns

The DataDict itself with an added category ‘sensitivity’.

Return type

DataDict
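The averaging step mentioned above (taking the mean across iterations before the analysis) can be pictured with a small sketch. The function `mean_over_iterations`, the row layout `(sample_id, iteration, value)`, and the numbers are all assumptions made for illustration; the sketch does not call SALib and does not reproduce how the library stores its data.

```python
from collections import defaultdict
from statistics import mean


def mean_over_iterations(rows):
    """Average a reporter value over iterations, per parameter sample.

    rows: iterable of (sample_id, iteration, value) tuples (hypothetical layout).
    """
    grouped = defaultdict(list)
    for sample_id, _iteration, value in rows:
        grouped[sample_id].append(value)
    # One averaged value per sample, ordered by sample id.
    return {sample_id: mean(values) for sample_id, values in sorted(grouped.items())}


# Two parameter samples, each run for two iterations (made-up values).
rows = [
    (0, 0, 1.0),
    (0, 1, 3.0),
    (1, 0, 10.0),
    (1, 1, 20.0),
]
averaged = mean_over_iterations(rows)
```

After this step there is exactly one value per parameter sample, which is the shape a sensitivity analysis over samples needs.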

Save and load

DataDict.save(exp_name=None, exp_id=None, path='ap_output', display=True)[source]

Writes data to directory {path}/{exp_name}_{exp_id}/.

Works only for entries that are of type DataDict, pandas.DataFrame, or serializable with JSON (int, float, str, dict, list). Numpy objects will be converted to standard objects, if possible.

Parameters
  • exp_name (str, optional) – Name of the experiment to be saved. If none is passed, self.info[‘model_type’] is used.

  • exp_id (int, optional) – Number of the experiment. Note that passing an existing id can overwrite existing data. If none is passed, a new id is generated.

  • path (str, optional) – Target directory (default ‘ap_output’).

  • display (bool, optional) – Display saving progress (default True).
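As a rough illustration of the two rules above, the sketch below builds the target directory from path, exp_name, and exp_id, and checks whether a value can be written as JSON. Both helpers (`target_dir`, `is_json_serializable`) are hypothetical and use only the standard library; they are not the library's own code, and the experiment name is made up.

```python
import json
import os


def target_dir(path, exp_name, exp_id):
    """Build the directory {path}/{exp_name}_{exp_id} described above."""
    return os.path.join(path, f"{exp_name}_{exp_id}")


def is_json_serializable(value):
    """Return True if the standard json module can serialize the value."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


directory = target_dir("ap_output", "MyModel", 3)

ok = is_json_serializable({"steps": 10, "rates": [0.1, 0.2], "name": "run"})
not_ok = is_json_serializable({1, 2, 3})  # a set has no JSON representation
```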

classmethod DataDict.load(exp_name=None, exp_id=None, path='ap_output', display=True)[source]

Reads data from directory {path}/{exp_name}_{exp_id}/.

Parameters
  • exp_name (str, optional) – Experiment name. If none is passed, the most recent experiment is chosen.

  • exp_id (int, optional) – Id number of the experiment. If none is passed, the highest available id is used.

  • path (str, optional) – Target directory (default ‘ap_output’).

  • display (bool, optional) – Display loading progress (default True).

Returns

The loaded data from the chosen experiment.

Return type

DataDict
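The rule "if no exp_id is passed, the highest available id is used" can be sketched as a directory scan. The helper `highest_exp_id` is hypothetical and assumes that saved experiments are subdirectories named {exp_name}_{exp_id}, as in the directory format given above; it is an illustration, not the library's lookup code.

```python
import os
import re
import tempfile


def highest_exp_id(path, exp_name):
    """Return the highest id among directories named {exp_name}_{id}, or None."""
    pattern = re.compile(rf"^{re.escape(exp_name)}_(\d+)$")
    ids = []
    for entry in os.listdir(path):
        match = pattern.match(entry)
        if match and os.path.isdir(os.path.join(path, entry)):
            ids.append(int(match.group(1)))
    return max(ids) if ids else None


# Demonstration in a temporary directory with made-up experiment names.
with tempfile.TemporaryDirectory() as root:
    for exp_id in (1, 2, 10):
        os.makedirs(os.path.join(root, f"MyModel_{exp_id}"))
    os.makedirs(os.path.join(root, "OtherModel_99"))
    chosen = highest_exp_id(root, "MyModel")
```

Ids are compared as integers, so 10 ranks above 2 even though "MyModel_10" sorts before "MyModel_2" as a string.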