Data series (a.k.a. sequences, or time series) are present in virtually every scientific and social domain: from health care, astronomy and biology, to finance and the internet-of-things.
In astronomy, there are applications with more than 70TB of spectroscopic sequence data, while by 2025 scientists are expected to collect around 2-40 ExaBytes of DNA sequence data.
Our research aims to change a landscape in which database systems are used merely for storing and retrieving data, by enabling scientists to transparently use specialized query processing systems to access their sequential data.
NESTOR uses specialized summarization techniques both to reduce the size of data series and to enable fast analytics. It also supports the construction of domain-specific indexes and decides when to use them by performing access path selection. Such indexes facilitate both analytical queries (such as similarity search) and aggregation queries.
Data storage, indexing, and query processing can all scale to large clusters of computing nodes, allowing both multi-TB data processing and large analytical jobs that complete in seconds.
NESTOR's storage layer continuously and adaptively reorganizes the underlying data layout in order to match the current workload, without incurring any additional overhead.
We exploit modern hardware optimizations, such as SIMD, NUMA-aware multi-processing, GPUs, and SSD-specific optimizations.
We have developed the current state-of-the-art data series indexes: iSAX2+ (bulk loading), ADS+ and Dumpy (adaptive), DPiSAX and Odyssey (distributed), ParIS+ and Hercules (multi-core), SING (GPU), MESSI and Elpis (in-memory), Coconut-LSM (streaming series), ULISSE (variable-length), and ProS (progressive query answering). We have also developed the first data series query workload benchmark, as well as DSStat, a toolset for data series preprocessing and visualization.
We have applied our techniques on streaming and uncertain data series, and have worked with data from diverse domains, such as home networks, road tunnels, seismology, neuroscience, astrophysics, manufacturing, as well as from deep learning embeddings.
Extensive experimental evaluations demonstrate that our techniques are the state-of-the-art for exact search and approximate search with quality guarantees, and the only viable solution for disk-resident datasets for both data series and general high-dimensional vector datasets.
Moreover, we have developed unsupervised methods for subsequence anomaly detection: NormA and Series2Graph (offline), and SAND (online). These methods exhibit state-of-the-art performance across a variety of dataset characteristics and anomaly types, without requiring domain knowledge, labeled data, or anomaly-free datasets to learn from.
In our tutorials we describe the most prevalent similarity search methods developed in both the data series and the high-dimensional communities, and comment on their merits and drawbacks. We present recent results from extensive experimental comparison studies, which demonstrate the superiority of the state-of-the-art data series methods. We also present and discuss the state-of-the-art methods in data series analytics, and subsequence anomaly detection in particular.
Paul Boniol, John Paparrizos, Themis Palpanas. EDBT 2023
Karima Echihabi, Themis Palpanas. MDM 2022
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas. VLDB 2021
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas. ICDE 2021
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas. EDBT 2021
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas. IEEE BigData 2020
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas. ISCC 2020
There is an increasingly pressing need, by several applications in diverse domains, for developing techniques able to index and mine very large collections of sequences (a.k.a. data series, or time series). Examples of such applications come from biology, astronomy, entomology, the web, and other domains.
Anthony Bagnall, Richard L. Cole, Themis Palpanas, Kostas Zoumpatianos. Dagstuhl Reports 2019
Themis Palpanas and Volker Beckmann. SIGMOD Record 2019
Kostas Zoumpatianos, Stratos Idreos, Themis Palpanas. NEDB 2019
Kostas Zoumpatianos, Themis Palpanas. ICDE 2018
Themis Palpanas. HPCS 2017
Themis Palpanas. ICDE 2016
Themis Palpanas. LNCS 2016
Themis Palpanas. SIGMOD Record 2015
For big data exploration, it is prohibitive to rely on full sequential scans for every single query; therefore, indexing is required. The goal of our indexing techniques is to make query processing efficient enough that analysts can repeatedly issue exploratory queries with quick response times and low initialization costs.
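To illustrate why summaries help avoid full sequential scans, here is a minimal sketch of exact nearest-neighbor search by filter-and-refine. It assumes Euclidean distance and uses the well-known Piecewise Aggregate Approximation (PAA) lower bound as a stand-in; it is not the implementation of any of the indexes listed on this page, and all function names (`paa`, `paa_lower_bound`, `filter_and_refine`, `random_walk`) are hypothetical.

```python
import math
import random

def paa(series, segments):
    """Summarize a series by the mean of each of `segments` equal-length chunks."""
    size = len(series) // segments  # assumes len(series) is a multiple of segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def paa_lower_bound(paa_a, paa_b, size):
    """Distance on summaries; mathematically never exceeds the raw Euclidean distance."""
    return math.sqrt(size * sum((x - y) ** 2 for x, y in zip(paa_a, paa_b)))

def sequential_scan(query, dataset):
    """Baseline: compute the full distance to every series."""
    return min((euclidean(query, s), i) for i, s in enumerate(dataset))

def filter_and_refine(query, dataset, summaries, segments):
    """Exact 1-NN: visit candidates by increasing lower bound, stop once it cannot win."""
    size = len(query) // segments
    q_paa = paa(query, segments)
    bounds = sorted((paa_lower_bound(q_paa, s, size), i) for i, s in enumerate(summaries))
    best_dist, best_idx, full_computations = math.inf, -1, 0
    for bound, i in bounds:
        if bound >= best_dist:
            break  # every remaining candidate has an even larger lower bound
        full_computations += 1
        d = euclidean(query, dataset[i])
        if d < best_dist:
            best_dist, best_idx = d, i
    return best_dist, best_idx, full_computations

def random_walk(length):
    """Synthetic data for the demo only (an assumption, not a real dataset)."""
    value, walk = 0.0, []
    for _ in range(length):
        value += random.gauss(0, 1)
        walk.append(value)
    return walk

random.seed(0)
dataset = [random_walk(64) for _ in range(200)]
summaries = [paa(s, 8) for s in dataset]  # built once, reused by every query
query = random_walk(64)
dist, idx, full = filter_and_refine(query, dataset, summaries, 8)
print(f"nearest: {idx}, distance: {dist:.3f}, full distances computed: {full} of {len(dataset)}")
```

The summaries are computed once and amortized over many exploratory queries; a real index additionally organizes them hierarchically so that the lower bounds themselves need not be computed for every series.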
Panagiota Fatourou, Eleftherios Kosmas, Themis Palpanas, George Paterakis. SRDS 2023
Zeyu Wang, Qitong Wang, Peng Wang, Themis Palpanas, Wei Wang. SIGMOD 2023
Ilias Azizi, Karima Echihabi, Themis Palpanas. PVLDB 2023
Manos Chatzakis, Panagiota Fatourou, Eleftherios Kosmas, Themis Palpanas, Botao Peng. PVLDB 2023
Karima Echihabi, Panagiota Fatourou, Kostas Zoumpatianos, Themis Palpanas, Houda Benbrahim. PVLDB 2022
Georgios Chatzigeorgakidis, Dimitrios Skoutas, Kostas Patroumpas, Themis Palpanas, Spiros Athanasiou, Spiros Skiadopoulos. TKDE 2022
Qitong Wang (supervised by: Themis Palpanas). VLDB PhD Workshop 2022
Botao Peng, Panagiota Fatourou, Themis Palpanas. ICDE 2021
Georgios Chatzigeorgakidis, Dimitrios Skoutas, Kostas Patroumpas, Themis Palpanas, Spiros Athanasiou, Spiros Skiadopoulos. EDBT 2021
Botao Peng, Panagiota Fatourou, Themis Palpanas. VLDBJ 2021
Oleksandra Levchenko, Boyan Kolev, Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Themis Palpanas, Dennis Shasha, Patrick Valduriez. KAIS 2020
Michele Linardi, Themis Palpanas. VLDBJ 2020
Themis Palpanas. CCIS 2020
Botao Peng, Panagiota Fatourou, Themis Palpanas. TKDE 2020
Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Themis Palpanas. TKDE 2020
Botao Peng (supervised by Panagiota Fatourou, Themis Palpanas). ICDE (PhD Workshop) 2020
Haridimos Kondylakis, Niv Dayan, Kostas Zoumpatianos, Themis Palpanas. SIGMOD 2019
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas, Houda Benbrahim. PVLDB 2020
Botao Peng, Panagiota Fatourou, Themis Palpanas. ICDE 2020
Karima Echihabi (supervised by Themis Palpanas and Houda Benbrahim). VLDB PhD Workshop 2019
Michele Linardi (supervised by Themis Palpanas). VLDB PhD Workshop 2019
Haridimos Kondylakis, Niv Dayan, Kostas Zoumpatianos, Themis Palpanas. VLDBJ 2019
Georgios Chatzigeorgakidis, Dimitrios Skoutas, Kostas Patroumpas, Themis Palpanas, Spiros Athanasiou, Spiros Skiadopoulos. SIGSPATIAL 2019
Oleksandra Levchenko, Boyan Kolev, Djamel-Edine Yagoubi, Dennis Shasha, Themis Palpanas, Patrick Valduriez, Reza Akbarinia, Florent Masseglia. ECML/PKDD 2019
Georgios Chatzigeorgakidis, Dimitrios Skoutas, Kostas Patroumpas, Themis Palpanas, Spiros Athanasiou, Spiros Skiadopoulos. SSTD 2019
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas, Houda Benbrahim. PVLDB 2019
Michele Linardi, Themis Palpanas. PVLDB 2019
Kostas Zoumpatianos, Yin Lou, Ioana Ileana, Themis Palpanas, Johannes Gehrke. VLDBJ 2018
Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Themis Palpanas. TKDE 2018
Botao Peng, Themis Palpanas, Panagiota Fatourou. IEEE BigData 2018
Haridimos Kondylakis, Niv Dayan, Kostas Zoumpatianos, Themis Palpanas. PVLDB 2018
Michele Linardi, Themis Palpanas. ICDE 2018
Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Themis Palpanas. ICDM 2017
Kostas Zoumpatianos, Stratos Idreos, Themis Palpanas. VLDBJ 2016
Kostas Zoumpatianos, Yin Lou, Themis Palpanas, Johannes Gehrke. KDD 2015
Alessandro Camerra, Jin Shieh, Themis Palpanas, Thanawin Rakthanmanon, Eamonn Keogh. KAIS 2014
Kostas Zoumpatianos, Stratos Idreos, Themis Palpanas. SIGMOD 2014
Alessandro Camerra, Themis Palpanas, Jin Shieh, Eamonn Keogh. ICDM 2010
Eamonn Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos, Marc Cardle. VLDB 2004
Examples of analysis operations are queries by content (range and similarity queries, nearest neighbors), clustering, classification, outlier patterns, frequent sub-sequences, and others.
Emmanouil Sylligardos, Paul Boniol, John Paparrizos, Panos Trahanias, Themis Palpanas. PVLDB 2023
Adrien Petralia, Philippe Charpentier, Paul Boniol, Themis Palpanas. e-Energy 2023
Paul Boniol, Mohammed Meftah, Emmanuel Remy, Themis Palpanas. SIGMOD 2022
Qitong Wang, Stephen Whitmarsh, Vincent Navarro, Themis Palpanas. PVLDB 2022
Alae Eddine El Hmimdi, Lindsey M Ward, Themis Palpanas, Vivien Sainte Fare Garnot, Zoi Kapoula. BrainSci 2022
John Paparrizos, Paul Boniol, Themis Palpanas, Ruey S. Tsay, Aaron Elmore, Michael J. Franklin. PVLDB 2022
Paul Boniol, John Paparrizos, Yuhao Kang, Themis Palpanas, Ruey Tsay, Aaron J. Elmore, Michael J. Franklin. PVLDB 2022
John Paparrizos, Yuhao Kang, Ruey Tsay, Paul Boniol, Themis Palpanas, Michael J. Franklin. PVLDB 2022
Alae Eddine El Hmimdi, Lindsey M Ward, Themis Palpanas, Zoi Kapoula. BrainSci 2021
Paul Boniol, John Paparrizos, Themis Palpanas, Michael J. Franklin. PVLDB 2021
Paul Boniol, John Paparrizos, Themis Palpanas, Michael J. Franklin. PVLDB 2021
Pauline Laviron, Zueqi Dai, Berenice Huquet, Themis Palpanas. e-Energy 2021
Paul Boniol, Michele Linardi, Federico Roncallo, Themis Palpanas, Mohammed Meftah, Emmanuel Remy. VLDBJ 2021
Paul Boniol, Themis Palpanas, Mohammed Meftah, Emmanuel Remy. PVLDB 2020
Paul Boniol, Themis Palpanas. PVLDB 2020
Paul Boniol (supervised by Themis Palpanas, Mohammed Meftah, Emmanuel Remy). VLDB PhD Workshop 2020
Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas. WIMS 2020
Paul Boniol, Michele Linardi, Federico Roncallo, Themis Palpanas. ICDE 2020
Paul Boniol, Michele Linardi, Federico Roncallo, Themis Palpanas. ICDE 2020
Michele Linardi, Yan Zhu, Themis Palpanas, Eamonn Keogh. DAMI 2020
Michele Linardi, Yan Zhu, Themis Palpanas, Eamonn Keogh. SIGMOD 2018
Michele Linardi, Yan Zhu, Themis Palpanas, Eamonn Keogh. SIGMOD 2018
Katsiaryna Mirylenka, Michele Dallachiesa, Themis Palpanas. SSDBM 2017
Katsiaryna Mirylenka, Michele Dallachiesa, Themis Palpanas. EDBT 2017
Novri Suhermi, Judit Gervain, Themis Palpanas. fNIRS 2016
Katsiaryna Mirylenka, Vassilis Christophides, Themis Palpanas, Ioannis Pefkianakis, Martin May. EDBT 2016
Katsiaryna Mirylenka, Alice Marascu, Themis Palpanas, Matthias Fehr, Stefan Jank, Gunter Welde, Daniel Groeber. APC|M 2013
Katsiaryna Mirylenka, Themis Palpanas, Graham Cormode, Divesh Srivastava. ICDE 2013
Alice Marascu, Suleiman Ali Khan, Themis Palpanas. PAKDD 2012
Themis Palpanas. Springer 2012
Using our techniques, users can explore large datasets and find patterns of interest, using nearest neighbor search. They can draw queries (data series) using a mouse, or touch screen, or they can select from their own datasets.
Karima Echihabi, Theophanis Tsandilas, Anna Gogolou, Anastasia Bezerianos, Themis Palpanas. VLDBJ 2022
Anna Gogolou, Theophanis Tsandilas, Karima Echihabi, Anastasia Bezerianos, Themis Palpanas. SIGMOD 2020
Anna Gogolou, Theophanis Tsandilas, Themis Palpanas, Anastasia Bezerianos. BigVis@EDBT 2019
Anna Gogolou, Theophanis Tsandilas, Themis Palpanas, Anastasia Bezerianos. TVCG 2019
Anna Gogolou, Theophanis Tsandilas, Themis Palpanas, Anastasia Bezerianos. IEEE VIS 2018
Kostas Zoumpatianos, Stratos Idreos, Themis Palpanas. VLDB 2015
In order to support time- and space-efficient management and analytics, data series need to be summarized. Different summarization techniques are applicable to different applications and problem settings.
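As a concrete example of what a summarization can look like, the sketch below computes a SAX-style symbolic word: the series is z-normalized, reduced with Piecewise Aggregate Approximation (PAA), and each segment mean is mapped to a symbol using equiprobable breakpoints of the standard normal distribution. This is the classic textbook scheme on which the iSAX family builds, shown under simplifying assumptions (series length divisible by the number of segments); it is not the code of any system listed here, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def z_normalize(series):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)  # constant series: nothing to scale
    return [(x - mu) / sigma for x in series]

def paa(series, segments):
    """Mean of each of `segments` equal-length chunks."""
    size = len(series) // segments  # assumes len(series) is a multiple of segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]

def sax_word(series, segments, alphabet_size):
    """Map each PAA value to a symbol in 0..alphabet_size-1."""
    # Breakpoints split the standard normal into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]
    word = []
    for value in paa(z_normalize(series), segments):
        # The symbol is the number of breakpoints lying below the value.
        word.append(sum(value > b for b in breakpoints))
    return word

# A 16-point ramp reduced to 4 symbols over a 4-letter alphabet.
print(sax_word(list(range(16)), segments=4, alphabet_size=4))
```

Here 16 raw values shrink to 4 small integers; the number of segments and the alphabet size trade summary size against fidelity, which is one reason different settings suit different applications.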
Qitong Wang, Themis Palpanas. TKDE 2023
Qitong Wang, Themis Palpanas. KDD 2021
Usman Raza, Alessandro Camerra, Amy L. Murphy, Themis Palpanas, Gian Pietro Picco. TKDE 2015
Usman Raza, Alessandro Camerra, Amy L. Murphy, Themis Palpanas, Gian Pietro Picco. PerCom 2012. Best Paper Award.
Themis Palpanas. Springer 2012
Themis Palpanas, Michail Vlachos, Eamonn Keogh, Dimitrios Gunopulos. TKDE 2008
Themis Palpanas, Michail Vlachos, Eamonn Keogh, Dimitrios Gunopulos, Wagner Truppel. ICDE 2004
Modeling tuples with value and existential uncertainty has several advantages. From an engineering perspective, a programmer can feed uncertain data directly into the system, without explicitly preprocessing data and forcing data approximations. From an application requirements perspective, maintaining possible values allows the application to provide results with confidence intervals.
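As a rough illustration of how keeping uncertainty in the data can yield results with confidence intervals, the sketch below sums uncertain tuples. The modeling choices are assumptions made for this example only: each tuple carries a mean, a standard deviation (value uncertainty), and an existence probability (existential uncertainty), tuples are independent, and the interval relies on a normal approximation of the sum. The function name `uncertain_sum` is hypothetical.

```python
import math
from statistics import NormalDist

def uncertain_sum(tuples, confidence=0.95):
    """Sum tuples given as (mean, std_dev, p_exists).

    Returns (expected_sum, lower, upper), where [lower, upper] is an
    approximate confidence interval under a normal approximation.
    """
    expected = 0.0
    variance = 0.0
    for mu, sigma, p in tuples:
        expected += p * mu
        # Variance of B * V, with B ~ Bernoulli(p) independent of the value V:
        # E[(BV)^2] - E[BV]^2 = p * (sigma^2 + mu^2) - (p * mu)^2
        variance += p * (sigma ** 2 + mu ** 2) - (p * mu) ** 2
    variance = max(variance, 0.0)  # guard against tiny negative rounding errors
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(variance)
    return expected, expected - half_width, expected + half_width

readings = [
    (10.0, 1.0, 1.0),  # certainly present, noisy value
    (20.0, 2.0, 1.0),
    (5.0, 0.5, 0.8),   # present with probability 0.8
]
estimate, low, high = uncertain_sum(readings)
print(f"sum = {estimate:.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

With few tuples or low existence probabilities the sum is far from normal, so a real system would more likely work with the full value distributions or with sampling; the point here is only that the uncertainty is propagated rather than discarded.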
Michele Dallachiesa, Gabriela Jacques-Silva, Bugra Gedik, Kun-Lung Wu, Themis Palpanas. KAIS 2015
Michele Dallachiesa, Themis Palpanas, Ihab F. Ilyas. VLDB 2015
Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, Themis Palpanas. VLDB 2012
Michele Dallachiesa, Besmira Nushi, Katsiaryna Mirylenka, Themis Palpanas. QUeST @ GIS 2011