Audiovisual documents representation with Annotations Interconnected Strata for contextual exploitation
PhD Dissertation, University of Lyon (INSA-Lyon)
directed by Alain Mille and Jean-Marie Pinon
Digitization and digital creation of audiovisual (AV) streams now allow their exploitation in audiovisual information systems. AV documents therefore need to be modelled so that random access and all kinds of uses are possible: search, indexing, navigation, etc. After a review of current proposals for modelling AV documents, we set out some requirements for AV representation. We propose to represent AV documents with Annotation Interconnected Strata (AI-Strata), which means "writing" on the stream with terms (annotation elements). These terms annotate parts of the stream (audiovisual units), are related to each other, and are instances of abstract annotation elements that are described in a conceptual relation graph. The whole system is in fact a global knowledge graph in which we define the notion of context as a contextual path extremity. We then describe some contextual manipulation tools based on the notion of potential graphs. A potential graph represents the user's description aim and is directly linked with his or her task. It is instantiated in the global graph (partial subgraph isomorphism search), and we propose an efficient algorithm for this, based on multi-propagation. We also propose a model of documentary information systems allowing intelligent storage of use experience as use cases that can be reused for user assistance. Finally, we discuss the relations between documents and knowledge.
Audiovisual documents and contents modelling, Knowledge representation, Contextual usage, subgraph isomorphism, Annotations-Interconnected Strata, use experience.
Here are some of the research themes I became interested in during this thesis:
- Information system, information retrieval
- Multimedia documents, mainly audiovisual documents
- Audiovisual document modelling
- Knowledge representation and documents
- Documentary contexts and contextual use of documents
- Information system use experience
- Graph-based and semi-structured information retrieval algorithms
... and also
- Structured documents and XML
- Audiovisual and multimedia semiotics
- Computer science and Humanities
(If you can read French, go there if you want to download or read the dissertation. You can also ask for reprints of articles that are written in English.)
The context of the thesis is the SESAME project (Audiovisual and Multimedia Sequences Exploration enriched by Experience), funded by CNET-France Telecom (contract 96-ME-17, Nov. 96 to Nov. 99). The initial objective of SESAME was to study the new possibilities opened up by digital audiovisual documents, along four axes: image processing for content indexing, parallel access to video data, video databases, and assistance to the user of an audiovisual information system based on experience from exploitation sessions. Two industrial partners were involved in the project: a TV channel, France3, and the local delegation of the French audiovisual institute.
I was mainly interested in the last theme, and in the following points:
- representing audiovisual documents so as to take into account their essential characteristics, such as temporality, the fact that they are mainly visual, and the fact that many analyses can be conducted on them (signal processing for feature calculation, high conceptual level analysis, etc.)
- indexing for information retrieval and document use within multiple tasks (search for simple visualization, analysis, reuse, etc.), not all of which can be foreseen during indexing, and by users who are all different: the question is that of sharing documentary descriptions
- helping with the different use tasks on the basis of document descriptions and, more generally, knowledge-based document exploitation and the relationships between knowledge and documents.
The objective was to study different research domains (documentary information retrieval, document description, knowledge-based systems) so as to appreciate the problem in a global way.
As we realized that there was no video representation model general enough to cope with our aim of managing documentary knowledge and use experience, I had to design the Annotation Interconnected Strata model (AI-Strata).
This model is based on an enhanced stratification approach, in which the whole system is considered as a graph (example graph, in French) whose nodes are:
- audiovisual units, representing pieces of audiovisual documents.
- annotation elements, linked with audiovisual units by annotation relations. Annotation elements contain the annotation of the audiovisual units they are linked with. Annotation elements can represent any audiovisual characteristic (cf. MPEG-7 descriptors). The elementary relation (between two annotation elements) is used to structure the description. By extension, it makes it possible to consider relations between audiovisual units that are not temporal (hence the name of the model: Annotation Interconnected Strata).
- abstract annotation elements, organized in a "knowledge base" (minimally with specialisation/abstraction relations), that represent the annotation vocabulary. Abstract annotation elements are in a decontextualizing relation with the annotation elements they represent.
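The three node types and the relations between them can be sketched as a small typed graph. This is only an illustration under stated assumptions, not the data structure used in the dissertation: the class and method names (`StrataGraph`, `relate`, `related_units`), the direction chosen for each relation, and the example identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    AVU = "audiovisual unit"
    AE = "annotation element"
    AAE = "abstract annotation element"


class Rel(Enum):
    ANNOTATION = "annotation"                # AE  -> AVU (direction is an assumption)
    ELEMENTARY = "elementary"                # AE  -> AE
    DECONTEXTUALIZING = "decontextualizing"  # AAE -> AE (direction is an assumption)
    SPECIALISATION = "specialisation"        # AAE -> AAE


# Which (source kind, target kind) pair each relation may link.
ALLOWED = {
    Rel.ANNOTATION: (Kind.AE, Kind.AVU),
    Rel.ELEMENTARY: (Kind.AE, Kind.AE),
    Rel.DECONTEXTUALIZING: (Kind.AAE, Kind.AE),
    Rel.SPECIALISATION: (Kind.AAE, Kind.AAE),
}


@dataclass
class StrataGraph:
    kinds: dict = field(default_factory=dict)  # node id -> Kind
    edges: set = field(default_factory=set)    # (source id, Rel, target id)

    def add_node(self, ident, kind):
        self.kinds[ident] = kind

    def relate(self, source, rel, target):
        # Reject relations that do not respect the three-node-type constraints.
        if (self.kinds[source], self.kinds[target]) != ALLOWED[rel]:
            raise ValueError(f"{rel.value} relation cannot link {source} to {target}")
        self.edges.add((source, rel, target))

    def annotations_of(self, unit):
        return {s for s, r, t in self.edges if r is Rel.ANNOTATION and t == unit}

    def units_of(self, element):
        return {t for s, r, t in self.edges if r is Rel.ANNOTATION and s == element}

    def related_units(self, unit):
        """Units linked to `unit` through an elementary relation between annotations."""
        found = set()
        for ae in self.annotations_of(unit):
            for s, r, t in self.edges:
                if r is Rel.ELEMENTARY and ae in (s, t):
                    other = t if s == ae else s
                    found |= self.units_of(other)
        return sorted(found - {unit})


def build_example():
    """A tiny hypothetical base: two units, two annotations, two vocabulary terms."""
    g = StrataGraph()
    for ident, kind in [("avu1", Kind.AVU), ("avu2", Kind.AVU),
                        ("ae_person", Kind.AE), ("ae_shot", Kind.AE),
                        ("Person", Kind.AAE), ("Shot", Kind.AAE)]:
        g.add_node(ident, kind)
    g.relate("ae_person", Rel.ANNOTATION, "avu1")
    g.relate("ae_shot", Rel.ANNOTATION, "avu2")
    g.relate("ae_person", Rel.ELEMENTARY, "ae_shot")
    g.relate("Person", Rel.DECONTEXTUALIZING, "ae_person")
    g.relate("Shot", Rel.DECONTEXTUALIZING, "ae_shot")
    return g
```

In this sketch, `build_example().related_units("avu1")` follows the elementary relation between the two annotation elements and reaches `avu2`, which illustrates how two audiovisual units can be related without any temporal relation between them.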
Exploiting the AI-Strata graph means expressing contexts, considered as path extremities in this graph. To describe contexts, we designed the notion of potential graphs, which are built under the same constraints as the general graph (three types of nodes), with the additional possibility of defining generic nodes (with "*"). Potential graphs, as marks and signatures of the user's will to contextualize, can be manipulated as such, joined, extended, etc.
A potential graph gp is instantiated (examples, in French) in the general graph G if it is possible to find a subgraph g of G such that gp and g are isomorphic (not considering the generic nodes). Two algorithms (recursive propagation and multi-propagation) have been designed for potential graph instantiation. They benefit from a reasonable cut in the search space, which comes from the fact that each potential graph must have at least one perfectly known node: a known audiovisual unit, an abstract annotation element (unique by definition), or an annotation element explicitly assigned by the user.
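To make the idea of instantiation concrete, here is a minimal sketch that matches a potential graph against a general graph. It is a plain backtracking search anchored on the known nodes, not the recursive propagation or multi-propagation algorithms of the dissertation. The names `PNode` and `instantiate`, the relation labels, and the example data are hypothetical, and a generic "*" node is modelled as a `PNode` with no constraint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PNode:
    known_id: Optional[str] = None  # exactly this node of the general graph
    kind: Optional[str] = None      # required node type; None means any ("*")


def _search_order(pnodes, pedges, anchors):
    # Visit known nodes first, then spread along the potential graph's edges.
    order, seen, i = list(anchors), set(anchors), 0
    while i < len(order):
        v = order[i]
        i += 1
        for s, _, t in pedges:
            for a, b in ((s, t), (t, s)):
                if a == v and b not in seen:
                    seen.add(b)
                    order.append(b)
    return order + [v for v in pnodes if v not in seen]


def instantiate(nodes, edges, pnodes, pedges):
    """Return every mapping {potential node -> graph node} that preserves all edges.

    nodes: dict id -> kind; edges: set of (source, relation, target);
    pnodes: dict variable -> PNode; pedges: list of (variable, relation, variable).
    """
    anchors = [v for v, p in pnodes.items() if p.known_id is not None]
    if not anchors:
        raise ValueError("a potential graph needs at least one known node")
    order = _search_order(pnodes, pedges, anchors)
    results = []

    def candidates(var, mapping):
        p = pnodes[var]
        if p.known_id is not None:
            return [p.known_id] if p.known_id in nodes else []
        # Search-space cut: only look at neighbours of already matched nodes.
        for s, r, t in pedges:
            if s == var and t in mapping:
                return sorted(a for a, rel, b in edges if rel == r and b == mapping[t])
            if t == var and s in mapping:
                return sorted(b for a, rel, b in edges if rel == r and a == mapping[s])
        return sorted(nodes)  # variable not connected to anything matched so far

    def consistent(var, node, mapping):
        p = pnodes[var]
        if p.kind is not None and nodes[node] != p.kind:
            return False
        if node in mapping.values():  # two potential nodes never share a graph node
            return False
        for s, r, t in pedges:
            if s == var and t == var and (node, r, node) not in edges:
                return False
            if s == var and t in mapping and (node, r, mapping[t]) not in edges:
                return False
            if t == var and s in mapping and (mapping[s], r, node) not in edges:
                return False
        return True

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        var = order[i]
        for node in candidates(var, mapping):
            if consistent(var, node, mapping):
                mapping[var] = node
                extend(i + 1, mapping)
                del mapping[var]

    extend(0, {})
    return results


# Hypothetical general graph: two "Person" annotations and one "Shot" annotation.
NODES = {"Person": "AAE", "Shot": "AAE", "ae1": "AE", "ae2": "AE", "ae3": "AE",
         "avu1": "AVU", "avu2": "AVU", "avu3": "AVU"}
EDGES = {("Person", "decontextualizes", "ae1"), ("ae1", "annotates", "avu1"),
         ("Person", "decontextualizes", "ae3"), ("ae3", "annotates", "avu3"),
         ("Shot", "decontextualizes", "ae2"), ("ae2", "annotates", "avu2"),
         ("ae1", "elementary", "ae2")}

# Potential graph: "which units are annotated by an instance of Person?"
PNODES = {"concept": PNode(known_id="Person"), "x": PNode(kind="AE"), "u": PNode()}
PEDGES = [("concept", "decontextualizes", "x"), ("x", "annotates", "u")]
```

Calling `instantiate(NODES, EDGES, PNODES, PEDGES)` starts from the known abstract annotation element `Person` and only explores its neighbourhood, which is the kind of search-space cut described above. The design choice of returning all mappings at once is for brevity; an anytime variant would yield them one by one.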
The AI-Strata model allows us to resolve the conflict between the a priori segmentation approach and stratification. Moreover, it can handle audiovisual contexts, considering that any annotation participates in a structure, which serves as support for contextual annotation.
Some high-level tools have been designed on top of potential graphs and allow us to exploit an AI-Strata system while always taking contexts into account. Analysis dimensions help group the abstract annotation elements that prove useful for a precise annotation task. Analysis dimensions can be manipulated (junction, fusion). An analysis dimension always leads to potential graphs designating the abstract annotation elements that should be used to annotate.
We consider that using an audiovisual information system (indexing, navigating, searching, analysing, editing) always amounts to describing an audiovisual document. Description schemes are particular graphs that describe annotation schemes (audiovisual units that should be created, annotation elements that annotate them, constraints on their attributes, relations between the elements). Description schemes use analysis dimensions and represent local annotation schemes, hence potential graphs can be extracted from them in order to search the base.
A first prototype validates the graph approach (instantiation) and a second allows knowledge graphs to be exploited through a graphical interface.
An important result of the research is the AI-Strata model for audiovisual document representation: this model allows a free description (that is "written" on the stream), without assuming that there must be a documentary structure upon which every description would be based. Indeed, the classical documentary approach supposes that every user uses a document in the same way, and of course describes it in the same way, which constrains and freezes indexing and retrieval protocols. In our case, however, the objective that descriptions be reused within various tasks calls for a homogeneous framework for describing documentary content, and for contextual reuse of the descriptions. This framework also makes it possible to associate, in a reasoned way (i.e. with an explicit point of view), knowledge of low and high abstraction levels.
Beyond these flexible and powerful modelling principles, a second important result is the proposal of tools and generic mechanisms for contextual exploitation of AI-Strata. The notion of potential graphs that are instantiated in the AI-Strata general graph is a base tool upon which every other tool is built. The instantiation algorithm we propose has some interesting characteristics, such as being anytime and parametrized by a simple heuristic that is easy to change. Description schemes describe how to describe documents, and are expressed in a way very similar to the descriptions themselves. It is then easy to go from descriptions (actually used) to description schemes that will be organized and reused. Contextual exploitation makes it possible to draw a line between the user's task and will (which cannot be directly known) and their expression as potential graphs, i.e. the will to contextualize, which can be manipulated.
Finally, we also propose that the description of a documentary information system be completed with a use model (for instance AI-Strata) and simplified but explicit task models that allow use experience, rationalized by knowledge (as opposed to a "raw" trace without reference to any knowledge base), to be stored as use cases (different from UML use cases). We claim that this contributes to opening promising directions for experience-based user help.
Go there for references to my papers related to AI-Strata modelling and contextual use.
Here are all the references I used in my dissertation.
The project I was involved in continues into another one, which will allow us to pursue developments around my representation model, to study an AI-Strata knowledge base on a larger scale, and to pursue user experience modelling for learning and user help. Of course we must keep an eye on the developments of MPEG-7.
Research perspectives around the AI-Strata modelling principles lie in the following points:
- study of the application of AI-Strata to non-temporal sequential documents (like texts)
- study of the modelling of documents for which every structure is semantic, and for which it is usage that gives structure
- study of contextualizing in AI-Strata, linked with user's task
- study of annotation dynamics (knowledge-based annotation tools)
- study of de-temporalized representations for temporal documents
Other, more general research perspectives lie around the important theme of computer science and man (as an active being).