CSE Fall Departmental Demo Day

We’re thrilled to invite you to the third annual Computer Science and Engineering Fall Demo Day. Student groups from several CSE capstone classes will be presenting the culmination of three months of effort, hard work, (metaphorical) blood, sweat (well… caffeine, really), and tears (see above).

Where:

Davis Hall, 1st Floor Atrium

When:

Friday, Dec. 8, 12 PM - 5 PM.

## Photos

[Thanks to Ken Smith](https://kensmith.smugmug.com/University-at-Buffalo/CSE-Events/Demo-Day-12-07-2017/i-5tXkFH3)

## Awards

#### Best Presentation

* **Just-in-Time Datastructure Modeling**: *Darshana Balakrishnan*

#### Most Impactful

* **Vanir: Probabilistic Databases on Spark**: *Nicholas Cellino, William Spoth*
* **Differentially Private Sparse Inverse Covariance Estimation**: *Mengdi Huai, Di Wang*

## Judges

* Mike Buckley
* Karthik Dantu

## Presented Projects

This year's participating classes and projects include:

Languages and Runtimes for Big Data (CSE 662)

Vanir: Probabilistic Databases on Spark

Nicholas Cellino
William Spoth

Most data is unusable as directly collected, due to either user or system error. Converting this messy data into usable data takes time, money, and Advil. Probabilistic databases aim to streamline this cleaning process but often carry a large computation or data cost. Our solution is to leverage the distributed computing engine Apache Spark to mitigate the overhead associated with probabilistic databases.

Optimizer for Sampling Queries

Vandit Aruldas
Shivang Aggarwal
Sneha Krishnamurthy
Rakshit Muthappa Padetira

Probabilistic database systems aim to produce all possible results. Query sampling generates samples of possible results. The database is split into a fixed number (N) of possible worlds and the query is run on all N possible worlds in parallel by representing data in one of several ways. Using a cost model to predict runtimes allows us to decide on the fastest sampling strategy.
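To make the possible-worlds idea concrete, here is a minimal sketch in Python. It is not the project's implementation: the row format (a list of `(value, probability)` alternatives), the function names, and the example data are all assumptions made for illustration, and the worlds are evaluated one after another rather than in parallel or through a cost-model-selected representation.

```python
import random

def sample_worlds(uncertain_rows, n_worlds, seed=0):
    """Draw n_worlds concrete tables from rows whose values are uncertain.

    Hypothetical format: each row is a list of (value, probability)
    alternatives whose probabilities sum to 1.
    """
    rng = random.Random(seed)
    worlds = []
    for _ in range(n_worlds):
        world = []
        for alternatives in uncertain_rows:
            values = [v for v, _ in alternatives]
            weights = [p for _, p in alternatives]
            # Pick one alternative for this row in this world.
            world.append(rng.choices(values, weights=weights, k=1)[0])
        worlds.append(world)
    return worlds

def run_query_per_world(worlds, query):
    """Evaluate the same query on every sampled world (sequentially here)."""
    return [query(world) for world in worlds]

# Illustrative data: one certain row and two uncertain rows.
rows = [
    [(10, 1.0)],
    [(5, 0.5), (7, 0.5)],
    [(0, 0.9), (100, 0.1)],
]
worlds = sample_worlds(rows, n_worlds=1000)
results = run_query_per_world(worlds, sum)   # SUM over each world
estimate = sum(results) / len(results)       # sample mean of the query result
```

Each entry of `results` is the answer in one sampled world, so the spread of `results` reflects the uncertainty in the query answer, and `estimate` approximates its expected value.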

Just-in-Time Datastructure Modeling

Darshana Balakrishnan

The performance of a Database Management System is closely coupled to the index structures it uses, making index selection extremely important. This project supplements an existing generic datastructure framework called just-in-time data structures with a simulation + cost-analysis-based approach to derive policies for data organization.

Multi-Dimensional Cracking

Anna Jonet Joseph
Anand Sankar Bhagavandas

All the spatial indexing techniques in use today, including but not limited to the R-Tree, R+ Tree, Quad Tree, and Grid, require prior knowledge about the data and query workloads. Cracking involves physically reorganizing the data by dividing the database into manageable pieces based on the incoming query workload. A cracker index, an R+ Tree in our case, is created on the fly for the cracked pieces, and later queries are answered using it. Prior work on database cracking dealt purely with one-dimensional data; here we extend it to multi-dimensional databases.
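For readers new to cracking, the following is a minimal sketch of the one-dimensional case that the abstract describes as prior work, not the multi-dimensional R+ Tree variant presented by the project. The class name `CrackedColumn` and its methods are hypothetical, and the cracker index here is a pair of sorted lists rather than a tree.

```python
import bisect

class CrackedColumn:
    """One-dimensional cracking sketch: each range query partitions the data
    around its bounds, so the column becomes more ordered as queries arrive."""

    def __init__(self, values):
        self.data = list(values)
        self.pivots = []     # sorted pivot values seen so far
        self.positions = []  # positions[i]: index of first element >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this value
        # Bounds of the piece that the pivot falls into.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        # Rewrite only this piece (copying for clarity; real systems swap in place).
        self.data[lo:hi] = smaller + larger
        pos = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high, cracking as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]

col = CrackedColumn([13, 4, 55, 9, 1, 20, 7, 42])
first = col.range_query(5, 21)   # cracks the column at 5 and 21
second = col.range_query(1, 10)  # reuses the existing pieces, adds cracks at 1 and 10
```

The first query pays for partitioning the whole column; later queries only touch the pieces their bounds fall into, which is why no prior knowledge of the workload is needed.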

Cluster-Friendly Spatial Indexing

Nikita Ganesh Konda

The increasing size and density of spatial data has led to the use of NoSQL databases. However, efficient indexing of spatial data in NoSQL databases is hard. Currently, space-filling curves are widely adopted for indexing. However, these partition the entire space uniformly, which may not be efficient for regions with sparsely populated data. The aim of this project is to eliminate empty spaces by building a two-tier index with an R-tree as the global index and a space-filling curve as the local index.
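As a small illustration of how a space-filling curve turns two-dimensional points into sortable keys, here is a sketch of the Z-order (Morton) curve. The abstract does not say which curve the project uses, so Z-order is an assumption chosen as one common example; the function name and sample points are also illustrative, and the R-tree global tier is not shown.

```python
def z_order_key(x, y, bits=16):
    """Interleave the low `bits` bits of non-negative integers x and y.

    Bit i of x goes to position 2*i and bit i of y to position 2*i + 1,
    so points that are close in 2D tend to get nearby keys.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

# Sorting by key gives a one-dimensional order that a key-value store can index.
points = [(3, 3), (0, 1), (1, 0), (2, 2)]
ordered = sorted(points, key=lambda p: z_order_key(*p))
```

Because the key space covers the whole grid uniformly, sparse regions still consume key ranges, which is the inefficiency the two-tier design aims to avoid.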

Pattern Detection For Query Explanations

Deepti Chavan
Shruti Parab
Sushmita Sinha

The high-level aim of this project is to find correlations in a dataset and to optimize the process of finding them. Using these correlations as constraints, we look for possible reasons that justify or identify the presence of an outlier. We intend to determine the way in which value attributes are associated with dimension attributes.

Differential Privacy (CSE 660)

The comparison of two algorithms on private causal Inference

Liuyi Yao

Verification of Randomized Response in Coq

Weihao Qu

Reconstruction of a Database using Fourier attack

Sindhu Madhuri Morapakala

An Implementation of Differentially Private Bayesian Inference

Jiawen Liu

An implementation of the DualQuery algorithm

Shubham Shekhar Lagwankar
Muhammed Zaki Muhammed Husain Bakshi

Locally Differentially Private Protocols for Frequency Estimation

Venkata Gayatri Pratyusha Gundugola
Manish Kasireddy

Differentially Private Sparse Inverse Covariance Estimation

Mengdi Huai
Di Wang

Past Demo Days

Fall 2016 Fall 2017 Fall 2018 Fall 2019 Spring 2019 Fall 2020 Fall 2021 Spring 2021 Fall 2022 Spring 2022 Fall 2023 Spring 2023 Fall 2024 Spring 2024 Spring 2025