Title: Automatic data-driven schema evolution
Speaker: Nikolaos Trogkanis (UCSD)
Abstract: Careful database design enforces constraints that the stored data must satisfy, either implicitly through the schema structure or through explicit constraints. However, average users (those who are not database experts) often store data in flat tables with no constraints (e.g., spreadsheets). Furthermore, database designers often overlook some constraints, either because they lack domain expertise or because the constraints are not known a priori. As a result, the schema and constraints are unknown or incomplete, and the database needs to be redesigned to find a model that better fits the data (i.e., more constraints, less information capacity). We propose a system that automatically evolves a database schema into a "better" one by inferring the missing constraints from the data. It consists of three main stages: Data Mining, Schema Evolution, and Data Evolution. The Data Mining stage mines the missing constraints that a given instance of the schema satisfies. We discuss prior work on functional dependency mining and how it needs to be improved for data-driven schema evolution. The Schema Evolution stage uses the constraints mined in the previous stage to evolve the schema. The Data Evolution stage migrates the data to the new schema while preserving compatibility for existing applications. In the talk, we will discuss the various parts of this system, some of the previous work in the area, and the novel directions we are investigating.
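
To make the first two stages concrete, here is a minimal sketch of what mining functional dependencies from a flat table and then splitting the table might look like. It is not the system described in the talk: the function names (`holds`, `mine_single_attribute_fds`, `decompose`), the sample rows, and the restriction to dependencies with a single attribute on the left-hand side are all assumptions made for illustration. Note that a dependency found this way is only known to hold on the given instance, so it may be accidental rather than a true constraint of the domain.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds on this instance."""
    seen = {}
    for row in rows:
        key = row[lhs]
        # A violation is two rows that agree on lhs but differ on rhs.
        if key in seen and seen[key] != row[rhs]:
            return False
        seen[key] = row[rhs]
    return True


def mine_single_attribute_fds(rows, columns):
    """Brute-force check of every ordered pair of distinct columns."""
    return [(a, b) for a, b in permutations(columns, 2) if holds(rows, a, b)]


def decompose(rows, lhs, rhs):
    """Split off a lookup table (lhs, rhs) and drop rhs from the main table."""
    lookup = sorted({(row[lhs], row[rhs]) for row in rows})
    main = [{k: v for k, v in row.items() if k != rhs} for row in rows]
    return main, lookup


# Hypothetical spreadsheet-like data with no declared constraints.
rows = [
    {"order_id": 1, "zip": "92093", "city": "La Jolla"},
    {"order_id": 2, "zip": "92093", "city": "La Jolla"},
    {"order_id": 3, "zip": "92101", "city": "San Diego"},
    {"order_id": 4, "zip": "92103", "city": "San Diego"},
]
columns = ["order_id", "zip", "city"]

fds = mine_single_attribute_fds(rows, columns)
main, lookup = decompose(rows, "zip", "city")
```

On this sample, `zip -> city` is among the mined dependencies, so `city` can be moved into a separate lookup table keyed by `zip`. The pairwise check is quadratic in the number of columns and ignores multi-attribute left-hand sides, which is one reason a real miner needs a more careful search strategy.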