identify.core

API

cluster

Cluster value(s) on condition(s).

Identificier

Identificier

Identification base class.

Identificier.of_high_interest

Node uid representations identified as of high interest.

Identificier.high

Alias for of_high_interest.

Identificier.of_medium_interest

Node uid representations identified as of medium interest.

Identificier.medium

Alias for of_medium_interest.

Identificier.of_low_interest

Node uid representations identified as of low interest.

Identificier.low

Alias for of_low_interest.

Identificier.clustered_interest

Inter component results clustered by interest.

Identificier.cluster_interest

Cluster inter component results by interest.

Identificier.cluster_conditions

Dictionary of clustering conditions used.

Tessif module providing the core identification utilities.

class tessif.identify.core.Identificier(data, conditions_dict, reference=None)[source]

Bases: ABC

Identification base class.

The identification algorithm is housed in Identificier.cluster_interest, which needs to be overridden by child-specific implementations.

Parameters:
  • data – Result data to be analyzed for significant differences.

  • conditions_dict (dict, None, default=None) –

    Dictionary keying container(s) of dicts by the respective cluster labels “high”, “medium” and “low”. The dictionaries inside the tuples need to have the following keys:

    • thres specifying the threshold used.

    • oprt specifying the operator used.

    Used to cluster data by category/cluster label.

  • reference (str, None, default=None) –

    Defines the reference results to be used for calculating the statistical error values and Pearson correlation coefficients.

    If None is used (default), the average of the dataframes, as returned by average_timevarying_dataframe_results(), is used.
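The shape of conditions_dict described above can be illustrated with a short sketch. The threshold values and the helper check_conditions are hypothetical and not part of tessif. The only assumption taken from this page is that oprt names a function of Python's operator module, as in the cluster() examples.

```python
import operator

# Illustrative values only: each label maps to a tuple of condition dicts,
# and each condition dict carries the keys "oprt" and "thres".
conditions_dict = {
    "high": ({"oprt": "lt", "thres": 0.7}, {"oprt": "ge", "thres": 0.1}),
    "medium": ({"oprt": "ge", "thres": 0.7}, {"oprt": "ge", "thres": 0.1}),
    "low": ({"oprt": "ge", "thres": 0.7}, {"oprt": "lt", "thres": 0.1}),
}


def check_conditions(conditions):
    """Return a list of problems found in a conditions dict (hypothetical helper)."""
    problems = []
    for label, checks in conditions.items():
        for position, check in enumerate(checks):
            missing = {"oprt", "thres"} - set(check)
            if missing:
                problems.append(f"{label}[{position}] lacks {sorted(missing)}")
            elif not callable(getattr(operator, check["oprt"], None)):
                # Assumption: "oprt" must name a function in the operator module.
                problems.append(
                    f"{label}[{position}] has unknown operator {check['oprt']!r}"
                )
    return problems
```

Running such a check before constructing an identifier would surface a misspelled key or operator name early. Whether the library performs a similar validation itself is not stated here.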

property of_high_interest

Node uid representations identified as of high interest.

property high

Alias for of_high_interest.

property high_interest_results

Inter component results identified as highly interesting.

property of_medium_interest

Node uid representations identified as of medium interest.

property medium

Alias for of_medium_interest.

property medium_interest_results

Inter component results identified as being of medium interest.

property of_low_interest

Node uid representations identified as of low interest.

property low

Alias for of_low_interest.

property low_interest_results

Inter component results identified as being of low interest.

property cluster_conditions

Dictionary of clustering conditions used.

property clustered_interest

Inter component results clustered by interest.

property reference

Reference model used for identifications.

abstract cluster_interest()[source]

Cluster inter component results by interest.

abstract map_interest_results(data)[source]

Map data to identified interest categories.
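The override pattern for the two abstract methods can be sketched without tessif installed. Everything below is a hypothetical stand-in: IdentificierSketch only mirrors the documented method names, ThresholdIdentifier is a toy child class, and calling cluster_interest once in the constructor is an assumption, not documented behavior.

```python
import operator
from abc import ABC, abstractmethod


class IdentificierSketch(ABC):
    """Hypothetical stand-in mirroring the two abstract methods documented above."""

    def __init__(self, data, conditions_dict):
        self._data = data
        self._conditions = conditions_dict
        # Assumption: clustering happens once, at construction time.
        self._clustered = self.cluster_interest()

    @property
    def clustered_interest(self):
        return self._clustered

    @abstractmethod
    def cluster_interest(self):
        """Return a mapping of cluster label to node uids."""

    @abstractmethod
    def map_interest_results(self, data):
        """Return a mapping of cluster label to the matching entries of data."""


class ThresholdIdentifier(IdentificierSketch):
    """Toy child class: one scalar value per node uid, one condition per label."""

    def cluster_interest(self):
        clusters = {label: [] for label in self._conditions}
        for uid, value in self._data.items():
            for label, checks in self._conditions.items():
                # A node joins the first label whose conditions all hold.
                if all(getattr(operator, c["oprt"])(value, c["thres"]) for c in checks):
                    clusters[label].append(uid)
                    break
        return clusters

    def map_interest_results(self, data):
        return {
            label: {uid: data[uid] for uid in uids}
            for label, uids in self._clustered.items()
        }
```

Because both methods are declared abstract, instantiating the base class directly raises a TypeError; only a child class that implements both can be created.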

tessif.identify.core.cluster(values, conditions_dict)[source]

Cluster value(s) on condition(s).

Uses a dictionary of conditions utilizing Python's operators.

Parameters:
  • values (Container) – Container of number(s) on which the cluster conditions are checked.

  • conditions_dict (dict) –

    Dictionary keying container(s) of dicts by the respective cluster labels. The dictionaries inside the tuples need to have the following keys:

    • thres specifying the threshold used.

    • oprt specifying the operator used.

Returns:

Dictionary key specifying the cluster. Usually a string or a number.

Return type:

Hashable

Examples

Using a single-value condition check with two categories/clusters. Note that for single-value conditions, both the value itself and the inner conditions dict must be Containers, hence the trailing comma that turns both into tuples.

>>> values = [(9000,), (9001,), (42,)]
>>> conditions = {
...     "Its over 9000!": ({"oprt": "gt", "thres": 9000},),
...     "Nope": ({"oprt": "le", "thres": 9000},),
... }
>>> for value in values:
...     print(cluster(value, conditions))
Nope
Its over 9000!
Nope

Multiple values and conditions (inner dict tuple length) can be used; their lengths must match, however:

>>> values = [
...     ([0, 1], "high"),
...     ([1, 1],  "medium1"),
...     ([0, 0],  "medium2"),
...     ([1, 0],  "low"),
... ]
>>> # first condition = pcc, second condition = nmae
>>> conditions = {
...     "high": ({"oprt": "lt", "thres": 0.7}, {"oprt": "ge", "thres": 0.1}),
...     "medium1": ({"oprt": "ge", "thres": 0.7}, {"oprt": "ge", "thres": 0.1}),
...     "medium2": ({"oprt": "lt", "thres": 0.7}, {"oprt": "lt", "thres": 0.1}),
...     "low": ({"oprt": "ge", "thres": 0.7}, {"oprt": "lt", "thres": 0.1}),
... }
>>> for value_pairing in values:
...     print(cluster(value_pairing[0], conditions))
high
medium1
medium2
low
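The behavior shown in the examples above can be approximated with a few lines built on Python's operator module. This is a minimal sketch, not the tessif implementation: cluster_sketch is a hypothetical name, and returning the first matching label, raising on a length mismatch, and returning None when nothing matches are all assumptions.

```python
import operator


def cluster_sketch(values, conditions_dict):
    """Return the first label whose conditions all hold for the given values."""
    for label, checks in conditions_dict.items():
        if len(checks) != len(values):
            # Assumption: mismatched lengths are treated as an error.
            raise ValueError(
                f"{label!r} has {len(checks)} condition(s) for {len(values)} value(s)"
            )
        # Look up each operator by name ("gt", "le", ...) and apply it pairwise.
        if all(
            getattr(operator, check["oprt"])(value, check["thres"])
            for value, check in zip(values, checks)
        ):
            return label
    # Assumption: no matching label yields None.
    return None
```

Pairing values and conditions with zip is what makes the length requirement matter: a shorter tuple on either side would otherwise silently drop checks.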