The correspondence analysis platform for uncovering deep structure in data and information

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

We study two aspects of information semantics: (i) the collection of all relationships, (ii) tracking and spotting anomaly and change. The first is implemented by endowing all relevant information spaces with a Euclidean metric in a common projected space. The second is modelled by an induced ultrametric. A very general way to achieve a Euclidean embedding of different information spaces based on cross-tabulation counts (and from other input data formats) is provided by correspondence analysis. From there, the induced ultrametric that we are particularly interested in takes a sequential - e.g. temporal - ordering of the data into account. We employ such a perspective to look at narrative, 'the flow of thought and the flow of language' (Chafe). In application to policy decision making, we show how we can focus analysis in a small number of dimensions.
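As a rough illustration of the Euclidean embedding from cross-tabulation counts that the abstract mentions, the following is a minimal sketch of textbook correspondence analysis via a singular value decomposition. It is not taken from the article: the function name `correspondence_analysis` and the small count table are hypothetical, chosen only for illustration. The check at the end shows the property that makes the embedding useful, namely that squared Euclidean distances between rows in the factor space match the chi-squared distances between the corresponding row profiles.

```python
import numpy as np

def correspondence_analysis(counts):
    """Textbook correspondence analysis of a contingency table (a sketch).

    Returns row principal coordinates F, column principal coordinates G,
    and the principal inertias (squared singular values).
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals: (P - r c^T) / sqrt(r_i * c_j)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * s) / np.sqrt(r)[:, None]    # row principal coordinates
    G = (Vt.T * s) / np.sqrt(c)[:, None] # column principal coordinates
    return F, G, s ** 2

# Hypothetical cross-tabulation (e.g. 4 text segments by 3 terms).
counts = np.array([
    [20,  5,  2],
    [ 3, 18,  4],
    [ 1,  6, 15],
    [10,  9,  8],
])

F, G, inertias = correspondence_analysis(counts)

# Chi-squared distance between the profiles of rows 0 and 1 ...
profiles = counts / counts.sum(axis=1, keepdims=True)
col_mass = counts.sum(axis=0) / counts.sum()
chi2_sq = np.sum((profiles[0] - profiles[1]) ** 2 / col_mass)

# ... and the squared Euclidean distance between the same rows in factor space.
euclid_sq = np.sum((F[0] - F[1]) ** 2)

print(np.isclose(chi2_sq, euclid_sq))
```

Keeping only the first few columns of `F` and `G` gives the kind of low-dimensional view the abstract refers to when it speaks of focusing analysis in a small number of dimensions. The induced ultrametric would be built on top of such coordinates, for example by a hierarchical clustering that respects the sequential order of the rows; that step is not shown here.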

Original language: English
Pages (from-to): 304-315
Number of pages: 12
Journal: Computer Journal
Volume: 53
Issue number: 3
DOI: 10.1093/comjnl/bxn045
Publication status: Published - 1 Mar 2010
Externally published: Yes

Fingerprint

Decision making
Semantics

Cite this

@article{2b6a6381f7e14a7f868a3c8d6478639c,
title = "The correspondence analysis platform for uncovering deep structure in data and information",
abstract = "We study two aspects of information semantics: (i) the collection of all relationships, (ii) tracking and spotting anomaly and change. The first is implemented by endowing all relevant information spaces with a Euclidean metric in a common projected space. The second is modelled by an induced ultrametric. A very general way to achieve a Euclidean embedding of different information spaces based on cross-tabulation counts (and from other input data formats) is provided by correspondence analysis. From there, the induced ultrametric that we are particularly interested in takes a sequential - e.g. temporal - ordering of the data into account. We employ such a perspective to look at narrative, 'the flow of thought and the flow of language' (Chafe). In application to policy decision making, we show how we can focus analysis in a small number of dimensions.",
keywords = "Artificial intelligence, Content analysis, Indexing, Pattern recognition",
author = "Fionn Murtagh",
year = "2010",
month = "3",
day = "1",
doi = "10.1093/comjnl/bxn045",
language = "English",
volume = "53",
pages = "304--315",
journal = "Computer Journal",
issn = "0010-4620",
publisher = "Oxford University Press",
number = "3",
}

The correspondence analysis platform for uncovering deep structure in data and information. / Murtagh, Fionn.

In: Computer Journal, Vol. 53, No. 3, 01.03.2010, p. 304-315.

TY - JOUR

T1 - The correspondence analysis platform for uncovering deep structure in data and information

AU - Murtagh, Fionn

PY - 2010/3/1

Y1 - 2010/3/1

N2 - We study two aspects of information semantics: (i) the collection of all relationships, (ii) tracking and spotting anomaly and change. The first is implemented by endowing all relevant information spaces with a Euclidean metric in a common projected space. The second is modelled by an induced ultrametric. A very general way to achieve a Euclidean embedding of different information spaces based on cross-tabulation counts (and from other input data formats) is provided by correspondence analysis. From there, the induced ultrametric that we are particularly interested in takes a sequential - e.g. temporal - ordering of the data into account. We employ such a perspective to look at narrative, 'the flow of thought and the flow of language' (Chafe). In application to policy decision making, we show how we can focus analysis in a small number of dimensions.

KW - Artificial intelligence

KW - Content analysis

KW - Indexing

KW - Pattern recognition

UR - http://www.scopus.com/inward/record.url?scp=77649325965&partnerID=8YFLogxK

U2 - 10.1093/comjnl/bxn045

DO - 10.1093/comjnl/bxn045

M3 - Article

VL - 53

SP - 304

EP - 315

JO - Computer Journal

JF - Computer Journal

SN - 0010-4620

IS - 3

ER -