CLASSIF1 Data Pattern Classification
Fig. 12: Optimized Data Patterns: Reclassification of Learning Set Patients
- The disease classification mask ("disease signature")
for the reference group of individuals (normals) typically contains
a sequence of (0) characters because the majority (70%)
of the parameter values lie between the two percentile thresholds
of 15% and 85%, with 15% below (-) the lower percentile and 15% above (+) the
upper percentile.
- Unknown test set patients
are classified according to the highest positional
coincidence of the patient classification mask ("patient signature")
with any of the disease classification masks.
- The degree of overall coincidence between the patient mask
and the best fitting disease mask is expressed as the
mask coincidence factor. The coincidence factor
is 1.00 for risk patients #137,139,141,142,143,148 but also
for non risk (normal) patients #102,108,109
(arrows <= ) despite the fact that not all
triple matrix characters are (0). Fully coincident infarct risk
patients have all discriminatory parameters of the disease mask
increased (+). Patients with normal (0) or diminished (-) discriminatory
parameter values therefore belong to the non risk (normal) patients.
This means that parameter values (-) are counted as hits for non risk
(normal) patients, explaining the coincidence factor of 1.00 for
patients #102,108,109.
- The triple matrix patterns of the first 10 patients of
the normal individuals and the myocardial risk patients are displayed in
the above graph. The missing patients #101,105,110,140,145 belong to the
unknown test set patients,
meaning that they have remained unknown to the learning process.
- The position coincidence factor indicates the degree of
positional coincidence between patient classification mask
and disease classification masks in the vertical direction.
The present concept is that parameters close to the disease process have
higher coincidence factors than those depending on various specific genotypic
and exposure conditions. Information on specific genotype and
exposure conditions leading to disease (-> epidemiological data
pattern analysis) may be obtained in this way.
- Data pattern analysis makes it possible to evaluate the similarity of
data patterns in a formalized way. Patients #137,139,141,142,
143,148 for example show a data pattern of: ++++ as opposed to: ++0++
for patients #138,144,146 in the above graph. The analysis of the data
patterns may make it possible to detect typical genotypic and exposure conditions
that, in various combinations, ultimately lead to similar disease processes.
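The classification steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the CLASSIF1 implementation: all function names, the five-position masks and the representation of a mask as a list of accepted characters per position are assumptions made for this example. In particular, the "0-" entries of the normal mask encode the statement above that (0) and (-) both count as hits for non risk (normal) patients when all discriminatory parameters of the risk mask are (+).

```python
from statistics import quantiles


def thresholds(reference_values, lower=15, upper=85):
    """Lower/upper percentile thresholds of one parameter in the reference group."""
    cuts = quantiles(reference_values, n=100, method="inclusive")  # 99 cut points
    return cuts[lower - 1], cuts[upper - 1]


def encode(value, low, high):
    """Triple matrix character: '-' below low, '+' above high, '0' otherwise."""
    if value < low:
        return "-"
    if value > high:
        return "+"
    return "0"


def patient_mask(values, limits):
    """Patient classification mask: one character per parameter value."""
    return "".join(encode(v, lo, hi) for v, (lo, hi) in zip(values, limits))


def coincidence(patient, mask):
    """Mask coincidence factor: fraction of positions accepted by the mask."""
    hits = sum(ch in accepted for ch, accepted in zip(patient, mask))
    return hits / len(mask)


def classify(patient, masks):
    """Return (best fitting class, coincidence factor); ties go to the first class."""
    best = max(masks, key=lambda name: coincidence(patient, masks[name]))
    return best, coincidence(patient, masks[best])


def position_coincidence(patients, mask):
    """Per-position (vertical) fraction of patients whose character fits the mask."""
    n = len(patients)
    return [sum(p[i] in accepted for p in patients) / n
            for i, accepted in enumerate(mask)]


# Hypothetical masks with five discriminatory parameters (assumption).
MASKS = {
    "normal": ["0-"] * 5,  # (0) and (-) both count as hits for normals
    "risk": ["+"] * 5,     # all discriminatory parameters increased
}

if __name__ == "__main__":
    for pattern in ("+++++", "++0++", "0-00-"):
        print(pattern, classify(pattern, MASKS))
    print(position_coincidence(["+++++", "++0++"], MASKS["risk"]))
```

Under these assumptions a fully increased pattern and a pattern containing only (0) and (-) both reach a coincidence factor of 1.00 for their respective classes, while the pattern ++0++ is still assigned to the risk class with a lower factor. The position coincidence factor is lower at the third position, where one of the two example patients deviates from the risk mask.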
Last Update: Jul 13,2020
First display: Feb 16,2005