Digital Document and Information Retrieval Week (Semaine du Document Numérique et de la Recherche d’Information, SDNRI) 2014

Following the success of the joint editions of their respective conferences in 2010 in Sousse and in 2012 in Bordeaux, ARIA (Association francophone de Recherche d’Information et Applications) and GRCE (Groupement de Recherche en Communication Ecrite) have decided to hold the CORIA and CIFED conferences simultaneously in March 2014 in Nancy.

CORIA (COnférence en Recherche d’Information et Applications) and CIFED (Colloque International Francophone sur l’Écrit et le Document) are the gathering points of the French-speaking communities in information retrieval and in the analysis of written material and digitized documents, respectively. While preserving the specific character of each conference, this edition will give the two communities an opportunity to meet around topics where research synergies exist (multimedia document retrieval, user interaction models, scaling of information retrieval systems, performance evaluation tools for information retrieval). The goal is to bring together more than 120 participants in joint workshops and invited talks.


  • Ahead of SDNRI, the Semantic Information Retrieval workshop RISE will take place on Tuesday 18 March 2014.
  • For the first time, a Hackday is being held on 18 March, before the start of SDNRI. The day aims to bring together and energize the CORIA and CIFED communities around a challenge focused on information retrieval in documents.

Invited speakers:


Alex Graves

Google DeepMind, London

United Kingdom

Bio: Alex Graves studied theoretical physics at the Universities of Edinburgh and Cambridge before switching fields (and countries) to gain a PhD from the Dalle Molle Institute for Artificial Intelligence in Lugano, Switzerland. Throughout his PhD, and during the postdocs that followed at the Technical University of Munich and the University of Toronto, his research focused on the application of recurrent neural networks to sequential data, especially perceptual sequences such as audio and video. The big question his work addresses is memory: how we decide what to retain and how we use it to predict the future. His algorithms have found commercial application in speech and handwriting recognition, and now form part of a system deployed by postal services worldwide to automate the reading of envelope addresses. He is currently a senior researcher at DeepMind Technologies in London.

Talk: Teaching Computers to Read and Write: Recent Advances in Cursive Handwriting Recognition and Synthesis with Recurrent Neural Networks

Cursive handwriting has several properties that make it an interesting challenge for machine learning, including the wide diversity of letter forms, the need to integrate context information, and the difficulty of segmenting into individual characters. This talk presents a suite of recurrent neural network models able to transcribe and generate cursive handwriting, leading to state-of-the-art performance in both areas.



Iadh Ounis

University of Glasgow


Bio: Iadh Ounis is an Associate Professor in the School of Computing Science at the University of Glasgow. His research centers around the development of novel probabilistic information retrieval (IR) models and data-driven learning approaches for web, enterprise, social media and smart cities search. He leads a team of researchers investigating new models and applications in the general field of large-scale text retrieval. Dr Ounis is the principal investigator of the open source Terrier search engine, widely used both in academia and industry for large-scale text mining and search applications. He led many international IR projects and initiatives (e.g. the TREC Blog and Microblog track initiatives), chaired a number of major information retrieval-related events (e.g. ECIR 2008, CIKM 2011), and is on all program committees of major IR conferences and related events. Over the years, he has supervised numerous PhD students and postdoctoral research assistants on the general topic of large-scale information retrieval, and has authored over 150 refereed articles and publications.

Talk: Drowning in Data: On Big Data Streams in IR for Detecting, Tracking and Summarising Events on Twitter

Information Retrieval (IR) is currently in a transition period where many classical tasks are being re-examined in light of the emergence of new Big Data streams. In this talk, I will discuss four key challenges currently facing the IR community with respect to big data streams, namely: the shift from batch to stream processing; the physics of tackling Big Data; how to develop systems that can scale robustly to achieve high stream throughputs; and how to evaluate the effectiveness of stream processing systems over time. In particular, I will describe a recent project in which we investigated how to perform real-time and scalable event detection, tracking and summarisation from the Twitter stream, which illustrates these four challenges. I will detail the different IR components that comprised the project (event detection, real-time search and event summarisation) and how each is affected by the aforementioned challenges, and highlight some tools and techniques that can be used to help overcome those challenges.


Stéphane Marchand-Maillet

University of Geneva


Bio: Stéphane Marchand-Maillet is Associate Professor in the Department of Computer Science at the University of Geneva, Switzerland, where he leads the Viper group. His research is directed towards large-scale, distributed information indexing, mining and retrieval with emphasis on content understanding using Machine Learning strategies, with applications in Business and Education. He has authored, co-authored or edited a number of publications on these topics. He and his group are part of several national, European and international projects in the domain.
He has recently served as Chair of the Technical Committee 12 of the International Association for Pattern Recognition (IAPR-TC12, "Multimedia and Visual Information Systems"). He was the general co-chair of the ACM International Conference on Image and Video Retrieval (ACM-CIVR 2009). He was also the general co-chair of the International Conference of the ACM-SIG on Information Retrieval in 2010 (ACM-SIGIR 2010).

Talk: Digesting Large-scale data

Large data volumes are a challenge for all systems, from storage to analysis and retrieval. In this talk, after investigating properties of large-scale data, we explore two main ways of coping with such volumes. First, concentrating on the process, we discuss efficient scalable indexing techniques and their mapping onto parallel architectures. We then take a closer look at the data itself and explore the notion of quality in the relationship from data to information.