Software Prototypes

DataMingler: A Novel Approach to Data Virtualization

(collaboration with Damianos Chatziantoniou)

DataMingler is a novel tool under development that aims to provide all data stakeholders with a simple, graph-based, conceptual data layer, built in an agile manner on the data of multiple and variable data sources. DataMingler implements a novel data virtualization paradigm that we call the "data virtual machine" (DVM), which builds a data model bottom-up based on the available data and the target data processing tasks. An initial description of DVM can be found in "Data Virtual Machines: Data-Driven Conceptual Modeling of Big Data Infrastructures" [In the EDBT Workshop Search, Exploration, and Analysis in Heterogeneous Datastores (SEAData), 2020], and of DataMingler in "DataMingler: A Novel Approach to Data Virtualization" [In ACM International Conference of Special Interest Group on Management Of Data (SIGMOD) - Demo Track, 2021]. More information will be posted at http://www.datamingler.com

EnForce: A system for Automated Forecasting on Energy Data

(collaboration with Paris Kerasiotis and Mary Karatzoglidi)

EnForce is a novel system under development that provides fully automatic forecasting on time series data of building energy consumption. It uses statistical techniques and deep learning methods to make predictions on univariate or multivariate time series data, so that exogenous factors, such as outside temperature, are taken into account. EnForce provides automatic data preprocessing and handles noisy data with missing values and outliers. An initial description of EnForce can be found in "Automated energy consumption forecasting with EnForce" [In the Proceedings of Very Large Data Bases (PVLDB) - Demo, 2021].

SPARQL-Vision: A platform for querying SPARQL endpoints

(collaboration with Maria Krommyda)

SPARQL-Vision is a novel platform under development that aims to make SPARQL endpoints accessible to a wide audience that does not have knowledge of the RDF model, by enabling users to easily query, explore and visualize the information contained in the endpoints. The platform provides case-specific visualization solutions for SPARQL results based exclusively on features extracted from the result.

Octopus: A System for Processing and Managing Big Time Series Data

(collaboration with Bamdad Mousavi and Paris Kerasiotis)

Octopus is a prototype system under development that aims to tackle the challenges posed by very large and diverse collections of time series data, possibly deriving from multiple, heterogeneous and dispersed sources. The focus is the design and implementation of techniques for the management of batch and streaming time series data in the same system, and of methods to select the most appropriate technique depending on the input queries. Octopus can create pipelines for streaming time series data, perform computations over the streaming data, and provide data analytics capabilities on recent as well as historical data.

IVLG: Interactive Visualization of Large Graphs

National Technical University of Athens
(collaboration with Maria Krommyda)

IVLG is a fully fledged prototype system that implements a novel technique for the visualization of very large graphs, enabling many users to access the information concurrently through a user-friendly interface. It allows the user to navigate the dataset through different levels of abstraction and locate information using innovative exploration techniques. A carefully designed storage schema, along with an API that takes advantage of the appropriate indexing, handles datasets with millions of elements without raising any performance issues, even when accessed from devices with limited computational resources.

PAW: The Platform for Analytics Workflows (2014-2018)

University of Geneva
(collaboration with Maxim Filatov)

This is a platform that implements our techniques for workflow optimization over multiple execution engines. We consider workflows that span a DBMS, MapReduce engines, and an orchestration engine. This configuration is emerging as a common paradigm used to combine analysis of unstructured data with analysis of structured data (e.g., NoSQL plus SQL). PAW performs workflow design, single workflow optimization, multiple workflow optimization and online workflow recalibration.

The ASAP system (2014-now)

University of Geneva
(collaboration with all partners in the EU FP7 ASAP project)

This is a full-fledged system which is the goal of the ASAP FP7 research project. This system implements a dynamic open-source execution framework for scalable data analytics, which can accommodate multiple execution models, suitable for a variety of types of tasks and data over multi-engine environments.

SLA Data Management (2012-2014)

University of Geneva
(collaboration with Katerina Stamou, Jean-Henry Morin)

This is a software prototype that will implement the graph-based SLA representation and manage SLA data employing RDF triples in AllegroGraph.

COCCUS: A Framework for Self-Configured Cost-Based Query Services on the Cloud (2012-2014)

National Technical University of Athens
(collaboration with Ioannis Konstantinou, Dimitrios Tsoumakos)

COCCUS is a system that implements a modular framework for cost-aware query execution, adaptive query charge and optimization of cloud data services. Queries along with their execution preferences and budget constraints are input to COCCUS, and the latter adaptively determines query charge and manages secondary data structures according to various economic policies.

Cloud Data Management System (2010-2011)

École Polytechnique Fédérale de Lausanne
(collaboration with Debabrata Dash, Adrian Popescu)

The system offers data management in the cloud. It executes queries in parallel on multiple MySQL instances, employing the MapReduce paradigm.

Cloud Economy Layer & Cloud Data Service Provider Simulator (2009-2011)

École Polytechnique Fédérale de Lausanne
(collaboration with Debabrata Dash, Sofia Kyriakopoulou)

This is a software layer that implements a data-aware economy model designed for the provision of cloud data services. The software takes user queries as input and communicates them to a simulator of a cloud database.

GrouPeer System (2005-2012)

National Technical University of Athens
(collaboration with Giorgos Orfanoudakis, Dimos Bousounis)

GrouPeer achieves the dynamic clustering and grouping of heterogeneous autonomous databases that are the nodes of an unstructured overlay network (Peer-to-Peer databases), employing the propagation of queries in the network. GrouPeer consists of a software layer that communicates with a local PostgreSQL instance as well as with other instances of the same software. The website of the system can be found here.

SpatialP2P Simulator (2005-2008)

National Technical University of Athens

SpatialP2P is a simulator of a structured Peer-to-Peer overlay that manages geographical data.

Distributed Trigger Mechanism for P2P Databases (2004-2010)

National Technical University of Athens
(collaboration with Maher Manoubi)

This is a mechanism that enables the declaration and execution of distributed triggers between acquainted Peer-to-Peer databases. The mechanism is an extension of the centralized trigger mechanism of PostgreSQL.

Mobile P2P Database & Mobile P2P Network Simulator (2007-2008)

National Technical University of Athens
(collaboration with Konstantina Palla)

This is a simulator of a fast-evolving Peer-to-Peer overlay that consists of mobile databases. The simulator is based on Network Simulator 2. Each mobile peer consists of a software layer that receives, sends and processes active rules. The software layer communicates with an underlying instance of TinyDB.

Rule Mechanism for P2P Databases (2002-2003)

University of Toronto

This is a mechanism that implements the distributed and parallel processing of active rules in a distributed system of autonomous databases. The goal of the mechanism is to minimize the number of exchanged messages between the databases involved in the processing of an active rule.

Application for the Creation of Virtual Spaces (1999-2000)

National Technical University of Athens

This application enables the creation of 3D spaces in a declarative manner. It implements the descriptive language STEDEL.