Karine Zeitouni: pioneer of spatial data mining in France
Karine Zeitouni is a professor and lecturer at the Data and Algorithms for an Intelligent and Sustainable City Laboratory (DAVID - Univ. Paris-Saclay, UVSQ) and Deputy Director of Education at the Computer Sciences Graduate School of Université Paris-Saclay. An expert in massive data management, she was the first person to introduce spatial data mining in France.
Originally from Morocco, Karine Zeitouni began studying mathematics and physics at the University of Rabat in 1983. She also took computer science courses at the Master's level and became passionate about the subject. She continued her studies at Université de Strasbourg, where, from 1984 to 1987, she completed a Bachelor's degree and a Master's degree in computer science, then a DEA in graphic and image processing. To finance her studies, she taught computer science to university technology diploma (DUT) students as an associate assistant (a teaching load equivalent to that of a lecturer). In 1991, she defended her PhD thesis at Université Pierre et Marie Curie; it focused on the integration of spatial reasoning in geographic databases for mapping purposes. She wrote it in collaboration with the National Institute of Geographic and Forest Information (IGN). In the same year, Karine Zeitouni obtained a position as a temporary study and research assistant (ATER) at Université Paris-Diderot. In 1992, she became a lecturer and joined the Parallelism, Networks, Systems, Modelling (PRiSM) laboratory, from which the DAVID laboratory was created in 2015. She was promoted to professor in 2009. Karine Zeitouni continues her research in this laboratory today and teaches at the Technical Institute (IUT) of Vélizy, a constituent institute of Université de Versailles - Saint-Quentin-en-Yvelines (UVSQ).
Spatial databases
Karine Zeitouni's research area covers data processing, query optimisation and massive data mining to extract useful knowledge. "The boundaries are quite blurred in this field, but it also includes algorithmics and machine learning (a subfield of artificial intelligence)." Spatial and spatiotemporal data raise three kinds of issues: modelling, i.e. abstracting complex data with a spatial and/or temporal component in order to represent it in a system; semantics, i.e. expressing the processing logic in a language; and implementation, i.e. translating this language into efficient algorithms for processing queries.
This processing, or query evaluation, is very often accompanied by optimisation techniques when large amounts of data are involved. These techniques consist of defining indexes or access methods - i.e. data structures - that act as query accelerators and define efficient filters. "But when the data is complex, that is, when it's not numerical or textual but rather spatial, you have to rethink the whole model." In her PhD thesis, the lecturer defined a new topological and indexing model, namely an extension of the R-Tree data structure. "It was a new subject, where we converged the field of computer-aided design dedicated to cartography with semantic database management using spatial reasoning." These subjects are of particular interest to the humanities and social sciences, as well as to archaeologists, architects and urban planners.
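The idea of an index acting as a filter can be illustrated with a minimal sketch. It is not the model defined in the thesis: it only shows the filter-and-refine principle that R-Tree-style access methods rely on, using a flat list of minimum bounding rectangles where a real R-Tree would organise them hierarchically. All names (Box, bounding_box, window_query) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "Box") -> bool:
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)

    def contains(self, p: Point) -> bool:
        return self.xmin <= p[0] <= self.xmax and self.ymin <= p[1] <= self.ymax


def bounding_box(points: List[Point]) -> Box:
    """Minimum bounding rectangle of a set of points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def window_query(shapes: Dict[str, List[Point]], window: Box):
    """Return (candidates, results) for a rectangular window query.

    Filter step: keep shapes whose bounding box intersects the window.
    Refine step: keep candidates with at least one point inside the window.
    """
    candidates = [name for name, pts in shapes.items()
                  if bounding_box(pts).intersects(window)]
    results = [name for name in candidates
               if any(window.contains(p) for p in shapes[name])]
    return candidates, results


shapes = {
    "a": [(0, 0), (1, 1)],        # box far from the window: filtered out
    "b": [(0, 5), (5, 0)],        # box overlaps, but no point inside
    "c": [(10, 10), (11, 11)],    # box far from the window: filtered out
    "d": [(2.5, 2.5), (9, 9)],    # one point inside the window
}
candidates, results = window_query(shapes, Box(2, 2, 3, 3))
print(candidates, results)
```

The cheap rectangle test discards most shapes before the exact test runs; shape "b" shows why the refine step is still needed, since a bounding box can overlap the window even when the shape itself does not.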
Spatial data mining
In 1998, Karine Zeitouni presented her work at a conference organised by the computer science department of Simon Fraser University in Vancouver. It was there that she discovered data mining, which transforms data into useful information by uncovering relationships within it. "This approach was quite innovative at the time. I was fascinated by it, so I started a workshop on this topic in the context of spatial databases, to which I invited international pioneers." Addressed for the first time in France thanks to the lecturer, this new science applied to spatial data was the subject of a special issue of the international journal Géomatique. Karine Zeitouni has since applied this approach to all her work on trajectories and temporal data from sensors.
Astronomical data management
From 2012 to 2018, she expanded the scope of her research to include astronomical data. Initially conceived for two-dimensional data in geographic information systems, her methods were later extended to three-dimensional data, and then to spatiotemporal trajectories, typically derived from GPS tracks. The Paris Observatory asked her to apply these methods as part of the Mastodons challenges launched by the CNRS - a programme developed to explore new interdisciplinary approaches - in order to optimise queries that select specific areas of space within a large volume of observational data of the Universe. "The specific problem with this spatial data, but also the technological advances of big data, have required new methods of query optimisation." Among other tools, Karine Zeitouni developed the ASTROIDE optimisation software. It implements several queries essential to astronomical exploration, such as cone selection, adapted to the spatial data of the Universe. It also provides cross-matching, with an algorithm 6,000 times faster than a non-optimised solution, which distinguishes, in the context of space missions, objects already mapped from new objects.
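To make the two queries concrete, here is a minimal sketch of cone selection and cross-matching on a tiny catalogue. It does not reflect how ASTROIDE is implemented: it is the naive, non-optimised version that compares every pair of objects, which is the kind of baseline that optimised methods aim to beat. The function names, the (name, ra, dec) tuple layout and the sample coordinates are all assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angle in degrees between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(catalog, ra0, dec0, radius_deg):
    """Names of objects within radius_deg of the direction (ra0, dec0)."""
    return [name for name, ra, dec in catalog
            if angular_separation_deg(ra0, dec0, ra, dec) <= radius_deg]


def cross_match(observations, reference, tol_deg):
    """Split observations into already-mapped objects and new objects.

    Naive approach: each observation is compared with every reference
    object, and matched to the nearest one if it lies within tol_deg.
    """
    matched, new = {}, []
    for name, ra, dec in observations:
        best = min(reference,
                   key=lambda r: angular_separation_deg(ra, dec, r[1], r[2]),
                   default=None)
        if best is not None and angular_separation_deg(
                ra, dec, best[1], best[2]) <= tol_deg:
            matched[name] = best[0]
        else:
            new.append(name)
    return matched, new


# Hypothetical sample data: (name, right ascension, declination) in degrees.
reference = [("ref1", 10.0, 20.0), ("ref2", 50.0, -30.0)]
observations = [("obs1", 10.0001, 20.0001), ("obs2", 200.0, 5.0)]

in_cone = cone_search(reference, 10.0, 20.0, 1.0)
matched, new = cross_match(observations, reference, tol_deg=2 / 3600)
print(in_cone, matched, new)
```

The cost of this version grows with the product of the two catalogue sizes, which is why large surveys need spatial partitioning or indexing so that each object is compared only with its neighbours.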
A multi-functional career
The lecturer holds multiple positions within the DAVID laboratory. She heads the Ambient data access and mining (ADAM) team. Since the beginning of her career, she has supervised 17 PhD candidates and is currently supervising five more. Most of them work in the framework of national or international projects, which Karine Zeitouni leads or co-leads. She also co-manages, with the Strasbourg Astronomical Observatory, the BigData4Astro working group of the Masses de données, informations et connaissances en sciences (Data sets, information and knowledge in science - MaDICS) research group, which combines data management, data mining and machine learning.
The Deputy Director of Education of the Computer Sciences Graduate School
In March 2021, Karine Zeitouni took on new responsibilities, becoming Deputy Director of Education of the Computer Sciences Graduate School of Université Paris-Saclay. This entity includes 15 study paths in computer science, two in bioinformatics, and two Master's degrees in computer methods applied to business management (MIAGE). "I accepted this position enthusiastically because I believe in the future of Université Paris-Saclay and I am very eager to contribute to its construction."
She is working on three major challenges. The first is to make the Graduate School's study paths clearer in order to attract students and facilitate their enrolment. The second is to increase its appeal to the best foreign students by awarding them scholarships, and by exploiting or establishing strategic agreements in the framework of international exchange programmes. Partner institutions include the National Institute of Informatics (NII) in Tokyo (Japan), Macquarie University in Sydney (Australia), and the University of Chennai (India). The third is to encourage students, who are often tempted by the attractive salaries in the private sector, to pursue their careers in public research. "I want to show them that the academic field offers many opportunities for intellectual and cultural discovery. But also that the freedom and autonomy of scientific work are priceless!"