CSE Spring Departmental Demo Day

We’re thrilled to invite you to the fifth, newly biannual Comp. Sci. & Eng. Spring Demo Day. Student groups from several CSE capstone classes will be presenting the culmination of three months of effort, hard work, (metaphorical) blood, sweat (well… caffeine really), and tears (see above).

  Where:

Davis Hall, 1st Floor Atrium

  When:

Friday, December 6, 2019

Schedule

12:30 PM - Staff Arrive
1:00 PM - Participants Arrive [Setup starts for participants and sponsors]
2:00 PM - Networking for participants and judges
2:30 PM - Demo Day opens to public
3:00 PM - CSE 4/562 Databake-Off
4:30 PM - Breakdown and Judging Tabulated, shift into 101
5:00 PM - Prizes awarded, Teams give their pitch to audience
6:00 PM - Demo Day Ends

Acknowledgements

Thanks to everyone (Participants, Sponsors and Guests) for a hugely successful demo day!

Sponsors

Awards

First Place

Second Place

Third Place

Data-Bake-Off First Place

Data-Bake-Off Second Place

Presented Projects

This year’s participating classes and projects include:

Computational Linguistics (CSE 467/LIN 667)

French MWE and their structural differences in treebanks
  • Erin Pacquetet
In this study, I will explain what structural differences remain in French treebanks after they are converted to the CoNLL-U format and what implications these differences have. In particular, I will look at how multi-word expressions are encoded in these treebanks and what those differences mean in terms of dependency analysis. Finally, I will present how unified treebanks could potentially be merged to increase the accuracy of automatic dependency parsing.
Is DDI purely NLP?
  • Chinmay P Swami
Drugs play an important role in treating diseases ranging from the innocuous to the extremely noxious. However, administering the right drug is pivotal to restoring the patient's health. Administering the wrong combination of drugs has ramifications that can be harmless or life threatening. Hence, a system that, when presented with two drugs, can report whether giving them together could harm the patient would certainly improve the quality of patient care. Going one step further, the system could also predict the type of interaction, which would indicate the severity of the side effects. In this paper we leverage machine learning techniques to create a model that learns from sentences describing various kinds of drug-drug interactions (DDI) and predicts the type of interaction that would occur between two drugs administered together. Along the way, we use multiple experiments to answer the question of whether DDI is purely an NLP task. We also use domain knowledge, for example information about the molecular structure of the drug, to enhance the predictive capabilities of the model.
Dependency Parsing with NULL elements
  • Shruthi Shyam Rao
In linguistics, every element in a sentence plays a significant role. If an element is missing from a sentence, we introduce a NULL element in that spot to indicate that the sentence is not fully complete. This is done for better accuracy and understanding of the sentence. Hence, here we perform dependency parsing with NULL elements. For this, we used the Penn Treebank dataset, which contains a variety of null elements such as *T*, NP*, 0, *U*, and *?*. The null elements and their dependencies were identified from the dataset and introduced into the dependency tree structure without disturbing the dependencies of the elements already present in the tree.
Intricacies of Dozat-Manning's Parsing Algorithm
  • Wenqi Li
This paper details the non-projective dependency parsing algorithm described in Dozat and Manning (2017). While graph-based parsing models are known for using the Chu-Liu-Edmonds algorithm (Edmonds, 1967), this parser uses a greedy algorithm that is more closely based on Tarjan's algorithm for strongly connected components (Tarjan, 1972). The Dozat-Manning parser treats the probabilities of word dependencies learned from training data as a directed graph. The underlying idea is to identify cycles in the graph using Tarjan's algorithm and then greedily replace edges from vertices in the cycle with the next-maximum edge until the cycle is broken. Unlike Chu-Liu-Edmonds, Dozat-Manning does not expand or collapse the edges that result in a cycle; rather, outgoing edges from vertices in the cycle are chosen based on their weight and used as replacements until the cycle is broken. The dependency graph probabilities so generated are then used to deduce the parse output.
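The greedy cycle-breaking idea in the abstract above can be illustrated with a small sketch. This is not the authors' or the parser's implementation: the function names (`find_cycle`, `attached_to_root`, `greedy_parse`) and the score matrix are hypothetical, a simple path walk stands in for Tarjan's algorithm (enough here because every token has exactly one head), and the replacement head is restricted to tokens already attached to ROOT so that the loop is guaranteed to terminate. That restriction is an added simplification.

```python
def find_cycle(heads):
    """Return one cycle (list of token indices) in the head assignment, or None."""
    n = len(heads)
    state = [0] * n  # 0 = unvisited, 1 = on the current path, 2 = finished
    state[0] = 2     # index 0 is ROOT, which has no head
    for start in range(1, n):
        path, node = [], start
        while state[node] == 0:
            state[node] = 1
            path.append(node)
            node = heads[node]
        if state[node] == 1:  # walked back onto the current path: a cycle
            return path[path.index(node):]
        for v in path:
            state[v] = 2
    return None


def attached_to_root(heads):
    """Return the set of tokens whose chain of heads reaches ROOT (index 0)."""
    attached = {0}
    for start in range(1, len(heads)):
        path, seen, node = [], set(), start
        while node not in attached and node not in seen:
            seen.add(node)
            path.append(node)
            node = heads[node]
        if node in attached:
            attached.update(path)
    return attached


def greedy_parse(scores):
    """scores[d][h] is the (assumed) score of token h being the head of token d."""
    n = len(scores)
    heads = [0] * n
    # Step 1: each token greedily takes its highest-scoring head.
    for d in range(1, n):
        heads[d] = max((h for h in range(n) if h != d), key=lambda h: scores[d][h])
    # Step 2: while a cycle exists, re-attach the cycle member whose move
    # to a ROOT-attached head loses the least score.
    while True:
        cycle = find_cycle(heads)
        if cycle is None:
            return heads
        safe = attached_to_root(heads)
        _, d, h = min(
            (scores[d][heads[d]] - scores[d][h], d, h)
            for d in cycle
            for h in safe
        )
        heads[d] = h


# Toy example: tokens 1 and 2 prefer each other as heads, forming a cycle.
toy_scores = [
    [0.0, 0.0, 0.0, 0.0],  # ROOT row is unused
    [0.5, 0.0, 0.9, 0.0],
    [0.1, 0.8, 0.0, 0.1],
    [0.2, 0.7, 0.1, 0.0],
]
toy_heads = greedy_parse(toy_scores)
```

In the toy example, token 1 gives up its preferred head (token 2) for ROOT because that change costs less score than moving token 2, after which no cycle remains.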
Improving Machine Translation Results by Corpus Filtering for a Low-Resource Language
  • Anthony Rubin
(submitted to ACL SRW 2019, under review) In this paper, we discuss the implementation of a method to reduce noise in a parallel corpus. A compositionality method based on cosine similarity, together with basic filtering features, is used with a classifier to predict good sentence translation pairs within a parallel corpus. The purpose is to clean a noisy parallel corpus to the point that it can be used to accurately train a machine translation model. The accuracy of our resulting machine translation models was quantified using BLEU scores, and results improved progressively with our proposed method.
Correlation between Fluency and Accuracy in Learner Corpus (submitted to ACL SRW 2019, under review)
  • Apurva Patil
  • Maggie Liu
Developing proficiency levels for non-native speakers has always been a difficult task. In this study, we explore the NUS learner corpus (Dahlmeier, Ng, and Wu, 2013) and its annotated grammatical errors to automatically assign proficiency levels to individual participants, using the fluency and accuracy of the learners' text. We determine upper and lower bounds for learners using statistical measures and classify learners into levels of accuracy and fluency. We conclude that perplexity as a fluency measure correlates more strongly with accuracy than the other fluency metrics used in previous work. In addition, this perplexity-based fluency metric is more effective in predicting proficiency.
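For readers unfamiliar with perplexity as a fluency measure, here is a minimal sketch of how it is computed: the exponential of the negative average log-probability per token. The abstract does not say which language model was used, so the add-one smoothed unigram model and the names `train_unigram` and `perplexity` are illustrative assumptions only.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Build an add-one smoothed unigram model (an assumed stand-in LM)."""
    counts = Counter(tokens)
    total = len(tokens)
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def log_prob(word):
        return math.log((counts[word] + 1) / (total + vocab))

    return log_prob


def perplexity(sentence, log_prob):
    """Perplexity = exp(-(1/N) * sum of token log-probabilities)."""
    tokens = sentence.split()
    if not tokens:
        raise ValueError("cannot score an empty sentence")
    avg_log_prob = sum(log_prob(t) for t in tokens) / len(tokens)
    return math.exp(-avg_log_prob)


# Toy reference text; a real study would train on a large corpus.
log_prob = train_unigram("the cat sat on the mat".split())
familiar = perplexity("the cat sat", log_prob)
unfamiliar = perplexity("dog ran far", log_prob)
```

Lower perplexity means the model finds the text more predictable, which is the sense in which it can serve as a proxy for fluency; here `familiar` comes out lower than `unfamiliar`.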
Data Augmentation for Grammatical Error Correction
  • Mengyang Qiu
  • Xuejiao Chen
Neural machine translation (NMT) approaches have been shown to be promising in grammatical error correction (GEC). However, the lack of high-quality training data in GEC is one major issue for NMT. The current study explores various ways of generating pseudo GEC data by focusing on real grammatical errors and their surrounding context. Fluency filtering based on language models is also incorporated to ensure the quality of artificial error generation.

Masters Project Development (CSE 611)

Trust Zone Computing in Mobile Applications
  • Sidharth Mishra
This project extends the OP-TEE operating system to work on the HiKey 960 board, and also explores some aspects of augmented reality computing in the trust zone.
Willo Mobile App
  • Shivam Agarwal
  • Shishir Suvarna
  • Deepak Sreenivasa
  • Krishna Parvathala
This project builds out a cross-platform mobile and web application as a proof of concept for the Willo startup. The app allows users to easily create and maintain a will through their smartphone.
Willo Back Office
  • Aditya Agarwal
  • Yasha Ballal
  • Trishala Kaushik
  • Bhagyashri Thorat
This project builds out the back office and API layer of the Willo startup. This allows for account management, financial tracking, customer relations management, and creation of will documents from flexible templates.
Action Recognition on Android
  • Mrinalini Upadhya
  • Yash Narendra Saraf
This project seeks to ascertain if, and to what extent, action recognition can be performed locally on an Android device.
Invenst Automation
  • Shuo Zhang
  • Xushuang Liu
  • Yuchen Zhang
  • Lei Chen
This project seeks to convert the manual effort of maintaining the Invenst club infrastructure into a data-driven web application, adding new features such as user profiles, skills, and dynamic approval flows and updates for projects in the idea bank.
Vehicle Trim Text Extraction
  • Pranav Vij
  • Saurab Chauhan
  • Nikhil Lala
  • Pranjal Jain
This project seeks to combine iOS app development with text recognition and machine learning to pull vehicle trim data from images of cars collected in real time.
Choreographic Lineage
  • Amit Bannerjee
  • Miki Padhiary
  • Yogesh Sawant
  • Shreyas Rajguru
This project builds on a system for dancers and other artists to contribute their professional relationships to a central datastore, and provides a networked visualization and profile of artists for researchers and dance enthusiasts to search.
Auto Transcription
  • Harshal Jagtap
  • Shubhra Deshpande
  • Ved Valsangkar
  • Smrati Singh
  • Rajat Thosar
This project transcribes design conversations in real time, and applies keyword tagging and sentiment analysis that can be used to categorize patterns of discussion and thought in design meetings.
Electric Vehicle Infrastructure Planning
  • Kavi Sanghavi
  • Saiyam Shah
  • Krishna Sehgal
  • Tanmay Singh
This project builds out a proof of concept for a local startup company helping define the best locations for charging stations for electric vehicles.
Onboard Diagnostics Text Extraction
  • Srinath Vikramakumar
  • Ruturaj Molawade
  • Yash Mali
  • Sai Krishna Uppala
This project combines text extraction and image categorization to both sort through images from cars for those that contain onboard diagnostics information, and extract that information in a textual format to an iOS app.
Crowdsource Data Reviews and Events Calendar
  • Saranya Illa
  • Amanda Pellechia
  • Sowmith Nallu
  • Alan Romano
  • Venkatesh Viswanathan
This project provides a mechanism for the Spectrum to crowdsource reviews of large datasets to the public, with each volunteer reviewing a small piece of the whole and indicating if they think that there is value in investigating further. It also creates an event calendar for exciting events on and off campus.
OneDataShare - Cross Platform Mobile Client
OneDataShare is a managed file transfer system that enables a user or a group of users to perform interprotocol file transfers using a cloud-based solution that guarantees reliability, efficiency, and security of data transfer. This project aims to extend the functionality provided by the OneDataShare web client to a mobile application using a cross-platform application development framework.
UB ANC Emulator Upgrade
  • Arun Suresh
  • Hariprasath Parthasarathy
The UB-ANC Emulator is an emulation environment created to design, implement, and test various applications (missions) involving one or more drones in software, and to provide a seamless transition to experimentation. In this project, to model the effect of interference, packet losses, and protocols on network throughput, latency, and reliability, we have integrated the EMANE network emulator into UB-ANC.
Distributed Music Player
  • Jon Battiston
Create a mesh network of devices that can discover and stream music to one another via a shared global playlist.
Tire Data Extraction
  • Akshay Verma
  • Adityan Harikrishnan
  • Roshni Murali
This project combines an iOS app with text extraction to read key data from the sides of automobile tires from images taken from the app camera.
Platter Restaurant App
  • Shubham Gulati
This project creates a proof of concept mobile app for startup Platter, which creates a subscription system for diners looking for discounts.

Applied NLP and Computational Social Science (CSE 702)

A Replication of Language Understanding for Text-based Games using Deep Reinforcement Learning
  • Yuhao Du
A Replication of Fake news on Twitter during the 2016 U.S. presidential election
  • Aamir Masood
  • Sanjay B.
Seq2Seq machine translation and gender bias analysis
  • Payraw Salih
  • Parth Shah
A Replication of Reducing Gender Bias Amplification using Corpus-level Constraints
  • Nishi Mehta
  • Pratik Kubal
A Replication of "From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews"
  • Dipannita Adhikary
  • Neeraj Abhyankar
A Replication of Exploring identity usage patterns in tweets and profile descriptions
  • Arjunil Pathak
  • Gokul Premaj
A Replication of “How Old Do You Think I Am?”: A Study of Language and Age in Twitter
  • Anish Gadekar
  • Shruti Bendale
  • Chris Chan

Independent Study

Anonymous public perception - An analysis of r/RoastMe and r/ToastMe
  • Niharika Raut
  • Gokul Premraj
Fake news in public WhatsApp groups during the 2019 Indian Election
  • Abhishek Bhave
  • Sidharth Pati
  • Ateendra Ramesh
What's influencing the president?
  • Akshada Chandrakant Bhor
  • Ruturaj Tukaram Molawade
SAM Research Database
  • Blake Cooper
  • Alex Liu
  • Lisa Kanbur
A sample-tracking database for the SAM Environmental Health Longitudinal Study.

Database Systems - Databake-Off (CSE 4/562)

Lone Wolf
  • Varsha Ganesh
This database once punched a table so hard that it became the first column store.
Neflix and Chill
  • Thejasweni Prakash Mysore
  • Anunay Rao
  • Apoorva Biseria
This database doesn't need indexes. It just intimidates tuples off the IO path.
Databass
  • Deepak Ranjan
  • Swarit Sanjiv Shah
  • Yash Narendra Saraf
Built using a volcano style implementation, this database buries its competition under a flow of molten tuples.
Megalodons
  • Aditya Agarwal
  • Shubham Gulati
  • Ram Vallabh Singh
This database once computed π exactly.
slow mo guys
  • Hariprasath Parthasarathy
  • Syed Aqhib Ahmed
  • Mohammad Umair
Not actually a database, but actually a quantum simulation of all possible computer programs.
Lannisters
  • Srinivas Rishindra Pothireddi
  • Lakshmi Narasimhavihari Vemuri
  • Sri Harsha
A Lannister always pays his DBAs.

Past Demo Days

Fall 2016 Fall 2017 Fall 2018 Fall 2019 Spring 2019 Fall 2020 Fall 2021 Spring 2021 Fall 2022 Spring 2022 Fall 2023 Spring 2023 Spring 2024