CSE Spring Departmental Demo Day

We’re thrilled to invite you to the fifth, newly bi-annual Comp. Sci. & Eng. Spring Demo Day. Student groups from several CSE capstone classes will be presenting the culmination of 3 months of effort, hard work, (metaphorical) blood, sweat (well… caffeine really), and tears (see above).

Where:

Davis Hall; 1st Floor Atrium

When:

Friday, December 6, 2019

Schedule

12:30 PM - Staff Arrives

1:00 PM - Participants Arrive [Setup starts for participants and sponsors]

2:00 PM - Networking for participants and Judges

2:30 PM - Demo Day opens to public

3:00 PM - CSE 4/562 Databake-Off

4:30 PM - Breakdown and Judging Tabulated, shift into 101

5:00 PM - Prizes awarded, Teams give their pitch to audience

6:00 PM - Demo Day Ends

Acknowledgements

Thanks to everyone (Participants, Sponsors and Guests) for a hugely successful demo day!

Awards

First Place

Vehicle Trim Text Extraction

Pranav Vij, Saurab Chauhan, Nikhil Lala, Pranjal Jain

Second Place

Crowdsource Data Reviews and Events Calendar

Saranya Illa, Amanda Pellechia, Sowmith Nallu, Alan Romano, Venkatesh Viswanathan

Third Place

Choreographic Lineage

Amit Bannerjee, Miki Padhiary, Yogesh Sawant, Shreyas Rajguru

OneDataShare - Cross Platform Mobile Client

Linus Castelino, Atul Kumar Singh, Harsh Gandhi

Data-Bake-Off First Place

slow mo guys

Hariprasath Parthasarathy, Syed Aqhib Ahmed, Mohammad Umair

Data-Bake-Off Second Place

Lannisters

Srinivas Rishindra Pothireddi, Lakshmi Narasimhavihari Vemuri, Sri Harsha

Lone Wolf

Varsha Ganesh

Presented Projects

This year’s participating classes and projects include:

Computational Linguistics (CSE 467/LIN 667)

French MWE and their structural differences in treebanks

Erin Pacquetet

In this study, I will explain what structural differences remain in French treebanks after they are converted to the CoNLL-U format and what implications these differences have. In particular, I will look at how multi-word expressions are encoded in these treebanks and what those differences mean for dependency analysis. Finally, I will present how unified treebanks could potentially be merged to increase the accuracy of automatic dependency parsing.

Is DDI purely NLP?

Chinmay P Swami

Drugs play an important role in treating diseases ranging from the innocuous to the extremely noxious. However, administering the right drug is pivotal to restoring the patient's health. Administering the wrong combination of drugs can have consequences that range from harmless to life-threatening. A system that, when presented with two drugs, can report whether giving them together could harm the patient would therefore improve the quality of patient care. Going one step further, the system could also predict the type of interaction, which would indicate the severity of the side effects. In this paper we leverage machine learning techniques to create a model that learns from sentences describing various kinds of drug-drug interactions and predicts the type of interaction that would occur between two drugs administered together. Along the way, we use multiple experiments to answer the question of whether DDI is purely an NLP task. We also use domain knowledge, for example information about the molecular structure of the drug, to enhance the predictive capabilities of the model.

Dependency Parsing with NULL elements

Shruthi Shyam Rao

In linguistics, every element in a sentence plays a significant role. If an element is missing from a sentence, we introduce a NULL element in that spot to mark that the sentence is not fully complete. This is done for better accuracy and understanding of the sentence. Hence, here we perform dependency parsing with NULL elements. For this, we used the Penn Treebank dataset, which contains a variety of null elements such as *T*, NP*, 0, *U*, and *?*. The null elements and their dependencies were identified from the dataset and introduced into the dependency tree structure without disturbing the dependencies of the elements already present in the tree.

Intricacies of Dozat-Manning's Parsing Algorithm

Wenqi Li

This paper details the non-projective dependency parsing algorithm described in Dozat and Manning (2017). Although graph-based parsing models are usually associated with the Chu-Liu-Edmonds algorithm (Edmonds, 1967), this parser uses a greedy algorithm instead, one based more closely on Tarjan's algorithm for strongly connected components (Tarjan, 1972). The Dozat-Manning parser treats the probabilities of word dependencies learned from training data as a directed graph. The underlying idea is to identify cycles in the graph using Tarjan's algorithm and then greedily replace edges from vertices in the cycle with the next-highest-weighted edge until the cycle is broken. Unlike Chu-Liu-Edmonds, Dozat-Manning does not expand or collapse the edges that result in a cycle; rather, outgoing edges from vertices in the cycle are chosen by weight and used as replacements until the cycle is broken. The dependency graph probabilities so generated are then used to deduce the parse output.
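The greedy cycle-breaking idea in this abstract can be illustrated with a small Python sketch. It is not the Dozat-Manning implementation: the score matrix layout, the function names, and the rule for picking a replacement head (the cheapest reattachment to a word that already reaches the root) are assumptions made for illustration. A plain head-pointer walk stands in for Tarjan's algorithm, which is enough here because every word has exactly one head.

```python
def reaches_root(heads, node):
    """Return True if following head pointers from node ends at the root (index 0)."""
    seen = set()
    while node != 0:
        if node in seen:
            return False  # walked into a cycle
        seen.add(node)
        node = heads[node]
    return True


def find_cycle(heads):
    """Return the nodes on one cycle of the head-pointer graph, or None."""
    for start in range(1, len(heads)):
        path = []
        node = start
        while node != 0 and node not in path:
            path.append(node)
            node = heads[node]
        if node != 0:
            # node was revisited, so the cycle starts where it first appeared
            return path[path.index(node):]
    return None


def greedy_parse(scores):
    """scores[dep][head] is a hypothetical score for attaching dep to head.

    Index 0 is an artificial root. Returns heads, where heads[dep] is the
    chosen head and heads[0] is a placeholder.
    """
    n = len(scores)
    heads = [0] * n
    # Step 1: every word independently picks its best-scoring head.
    for dep in range(1, n):
        candidates = [h for h in range(n) if h != dep]
        heads[dep] = max(candidates, key=lambda h: scores[dep][h])
    # Step 2: break cycles one at a time.
    while True:
        cycle = find_cycle(heads)
        if cycle is None:
            return heads
        # Assumption: only heads that already reach the root are allowed,
        # so a reattachment can never create a new cycle.
        safe_heads = [h for h in range(n) if reaches_root(heads, h)]
        best = None
        for dep in cycle:
            new_head = max(safe_heads, key=lambda h: scores[dep][h])
            loss = scores[dep][heads[dep]] - scores[dep][new_head]
            if best is None or loss < best[0]:
                best = (loss, dep, new_head)
        _, dep, new_head = best
        heads[dep] = new_head
```

With two words that each prefer the other as head, the sketch reattaches the word that loses the least score to the root and leaves the other edge in place. Each repair adds at least one word to the set that reaches the root, so the loop terminates.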

Improving Machine Translation Results by Corpus Filtering for a Low-Resource Language

Anthony Rubin

(submitted to ACL SRW 2019, under review) In this paper, we discuss the implementation of a method to reduce noise in a parallel corpus. A compositionality method based on cosine similarity, together with basic filtering features, is used with a classifier to predict good sentence translation pairs within a parallel corpus. The purpose is to clean a noisy parallel corpus to the point that it can be used to accurately train a machine translation model. The accuracy of the resulting machine translation models was quantified using BLEU scores, and our proposed method improved results progressively.
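As a rough illustration of filtering on cosine similarity, the Python sketch below scores a sentence pair by averaging word vectors on each side and comparing the two averages. Everything in it is assumed for illustration and is not taken from the paper: the embeddings are toy vectors presumed to live in a shared cross-lingual space, the length ratio is just one example of a "basic feature", and fixed thresholds stand in for the trained classifier the abstract mentions.

```python
import math


def sentence_vector(tokens, embeddings):
    """Average the vectors of known tokens (a simple additive composition)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_features(src, tgt, src_emb, tgt_emb):
    """Return (cosine similarity, length ratio) for one tokenized sentence pair."""
    u = sentence_vector(src, src_emb)
    v = sentence_vector(tgt, tgt_emb)
    sim = cosine(u, v) if u is not None and v is not None else 0.0
    ratio = min(len(src), len(tgt)) / max(len(src), len(tgt), 1)
    return sim, ratio


def keep_pair(src, tgt, src_emb, tgt_emb, min_sim=0.5, min_ratio=0.5):
    """Hypothetical threshold rule standing in for a trained classifier."""
    sim, ratio = pair_features(src, tgt, src_emb, tgt_emb)
    return sim >= min_sim and ratio >= min_ratio
```

In a fuller pipeline the two features would be passed to a classifier trained on known good and bad pairs rather than compared against hand-picked thresholds.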

Correlation between Fluency and Accuracy in Learner Corpus (submitted to ACL SRW 2019, under review)

Apurva Patil
Maggie Liu

Developing proficiency levels for non-native speakers has always been a difficult task. In this study, we explore the NUS learner corpus (Dahlmeier, Ng, and Wu, 2013) and its annotated grammatical errors to automatically assign proficiency levels to individual participants using the fluency and accuracy of the learners' text. We determine upper and lower bounds for learners using statistical measures and classify learners into levels of accuracy and fluency. We conclude that perplexity as a fluency measure correlates more strongly with accuracy than the fluency metrics used in previous work. In addition, this perplexity-based fluency metric is more effective in predicting proficiency.

Data Augmentation for Grammatical Error Correction

Mengyang Qiu
Xuejiao Chen

Neural machine translation (NMT) approaches have been shown to be promising in grammatical error correction (GEC). However, the lack of high-quality training data in GEC is one major issue for NMT. The current study explores various ways of generating pseudo GEC data by focusing on real grammatical errors and their surrounding context. Fluency filtering based on language models is also incorporated to ensure the quality of artificial error generation.

Masters Project Development (CSE 611)

Trust Zone Computing in Mobile Applications

Sidharth Mishra

This project extends the OP-TEE operating system to work on the HiKey 960 board, and also explores some aspects of augmented reality computing in the trust zone.

Willo Mobile App

Shivam Agarwal
Shishir Suvarna
Deepak Sreenivasa
Krishna Parvathala

This project builds out a cross-platform mobile and web application as a proof of concept for the Willo startup. The app allows users to easily create and maintain a will through their smartphone.

Willo Back Office

Aditya Agarwal
Yasha Ballal
Trishala Kaushik
Bhagyashri Thorat

This project builds out the back office and API layer of the Willo startup. This allows for account management, financial tracking, customer relations management, and creation of will documents from flexible templates.

Action Recognition on Android

Mrinalini Upadhya
Yash Narendra Saraf

This project seeks to ascertain whether, and to what extent, action recognition can be performed locally on an Android device.

Invenst Automation

Shuo Zhang
Xushuang Liu
Yuchen Zhang
Lei Chen

This project seeks to convert the manual effort of maintaining the Invenst club infrastructure into a data-driven web application, adding new features such as user profiles, skills, and dynamic approval flows and updates for projects in the idea bank.

Vehicle Trim Text Extraction

Pranav Vij
Saurab Chauhan
Nikhil Lala
Pranjal Jain

This project seeks to combine iOS app development with text recognition and machine learning to pull vehicle trim data from images of cars collected in real time.

Choreographic Lineage

Amit Bannerjee
Miki Padhiary
Yogesh Sawant
Shreyas Rajguru

This project builds on a system for dancers and other artists to contribute their professional relationships to a central datastore, and provides a networked visualization and profile of artists for researchers and dance enthusiasts to search.

Auto Transcription

Harshal Jagtap
Shubhra Deshpande
Ved Valsangkar
Smrati Singh
Rajat Thosar

This project transcribes design conversations in real time, and applies keyword tagging and sentiment analysis that can be used to categorize patterns of discussion and thought in design meetings.

Electric Vehicle Infrastructure Planning

Kavi Sanghavi
Saiyam Shah
Krishna Sehgal
Tanmay Singh

This project builds out a proof of concept for a local startup company helping define the best locations for charging stations for electric vehicles.

Onboard Diagnostics Text Extraction

Srinath Vikramakumar
Ruturaj Molawade
Yash Mali
Sai Krishna Uppala

This project combines text extraction and image categorization to both sort through images from cars for those that contain onboard diagnostics information, and extract that information in a textual format to an iOS app.

Crowdsource Data Reviews and Events Calendar

Saranya Illa
Amanda Pellechia
Sowmith Nallu
Alan Romano
Venkatesh Viswanathan

This project provides a mechanism for the Spectrum to crowdsource reviews of large datasets to the public, with each volunteer reviewing a small piece of the whole and indicating if they think that there is value in investigating further. It also creates an event calendar for exciting events on and off campus.

OneDataShare - Cross Platform Mobile Client

Linus Castelino
Atul Kumar Singh
Harsh Gandhi

OneDataShare is a managed file transfer system that enables a user or a group of users to perform inter-protocol file transfers using a cloud-based solution that guarantees the reliability, efficiency, and security of data transfer. This project aims to extend the functionality of the OneDataShare web client to a mobile application using a cross-platform application development framework.

UB ANC Emulator Upgrade

Arun Suresh
Hariprasath Parthasarathy

The UB-ANC Emulator is an emulation environment created to design, implement, and test various applications (missions) involving one or more drones in software, and to provide a seamless transition to experimentation. In this project, we have integrated the EMANE network emulator into UB-ANC to model the effect of interference, packet losses, and protocols on network throughput, latency, and reliability.

Distributed Music Player

Jon Battiston

Create a mesh network of devices that can discover and stream music to one another via a shared global playlist.

Tire Data Extraction

Akshay Verma
Adityan Harikrishnan
Roshni Murali

This project combines an iOS app with text extraction to read key data from the sides of automobile tires in images taken with the app's camera.

Platter Restaurant App

Shubham Gulati

This project creates a proof of concept mobile app for startup Platter, which creates a subscription system for diners looking for discounts.

Applied NLP and Computational Social Science (CSE 702)

A Replication of Language Understanding for Text-based Games using Deep Reinforcement Learning

Yuhao Du

A Replication of Fake news on Twitter during the 2016 U.S. presidential election

Aamir Masood
Sanjay B.

Seq2Seq machine translation and gender bias analysis

Payraw Salih
Parth Shah

A Replication of Reducing Gender Bias Amplification using Corpus-level Constraints

Nishi Mehta
Pratik Kubal

A Replication of "From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews"

Dipannita Adhikary
Neeraj Abhyankar

A Replication of Exploring identity usage patterns in tweets and profile descriptions

Arjunil Pathak
Gokul Premaj

A Replication of “How Old Do You Think I Am?”: A Study of Language and Age in Twitter

Anish Gadekar
Shruti Bendale
Chris Chan

Independent Study

Anonymous public perception - An analysis of r/RoastMe and r/ToastMe

Niharika Raut
Gokul Premraj

Fake news in public WhatsApp groups during the 2019 Indian Election

Abhishek Bhave
Sidharth Pati
Ateendra Ramesh

What's influencing the president?

Akshada Chandrakant Bhor
Ruturaj Tukaram Molawade

SAM Research Database

Blake Cooper
Alex Liu
Lisa Kanbur

A sample-tracking database for the SAM Environmental Health Longitudinal Study.

Database Systems - Databake-Off (CSE 4/562)

Lone Wolf

Varsha Ganesh

This database once punched a table so hard that it became the first column store.

Neflix and Chill

Thejasweni Prakash Mysore
Anunay Rao
Apoorva Biseria

This database doesn't need indexes. It just intimidates tuples off the IO path.

Databass

Deepak Ranjan
Swarit Sanjiv Shah
Yash Narendra Saraf

Built using a volcano style implementation, this database buries its competition under a flow of molten tuples.

Megalodons

Aditya Agarwal
Shubham Gulati
Ram Vallabh Singh

This database once computed π exactly.

slow mo guys

Hariprasath Parthasarathy
Syed Aqhib Ahmed
Mohammad Umair

Not actually a database, but actually a quantum simulation of all possible computer programs.

Lannisters

Srinivas Rishindra Pothireddi
Lakshmi Narasimhavihari Vemuri
Sri Harsha

A Lannister always pays his DBAs.
