Updates
|
- [Feb 2022] Started at Google (Core Data Infrastructure), Sunnyvale, CA as SWE
- [Jun 2021] Started at Amazon Web Services (Relational Database Service), E. Palo Alto, CA as SDE
- [Jul 2020] Accepted Research Assistantship under Prof. Remzi and Andrea Arpaci-Dusseau for Aug'20-May'21
- [Feb 2020] Incoming SDE Intern at Amazon Web Services (RDS Open Source), E. Palo Alto, CA for Summer'20
- [Jan 2020] Continuing with Teaching Assistantship at UW Madison for CS 559 (Computer Graphics) in Spring'20
- [Aug 2019] Received Teaching Assistantship at UW Madison for CS 559 (Computer Graphics) in Fall'19
- [Feb 2019] Accepted admit offer from University of Wisconsin-Madison with special CS scholarship
|
Key Internships
|
|
AWS Relational Database Service, Palo Alto, CA [May'20-August'20]
Manager: Jignesh Shah, Engineering Leader - Amazon RDS for PostgreSQL
Sandboxing of untrusted language procedures within RDS PostgreSQL
The goal was to facilitate sandboxed execution of untrusted functions, stored procedures and subtransactions inside Docker & LXC containers. This prevents customer programs from crashing the main database server instance and enforces finer access privileges, while keeping average transaction latency on par with equivalent extensions running locally on PostgreSQL
- Integrated an open-source extension PL/Container into PostgreSQL 12 & 13 using shared memory segments, gRPC channels & Unix sockets for communication between customer and container backend
- Performed extensive benchmarking with different container configurations to identify bottlenecks like usage of separate Docker containers for every customer session leading to frequent spawning of containers
- Developed a new prototype leveraging a single container across all customer sessions, leading to a 63% reduction in memory usage while scaling to thousands of customer connections
- Added runtime support for Go language, and created 3 separate extensions for each of R, Python and Go utilizing separate containers for better isolation
- Received a Full Time Offer in recognition of the above work
|
|
University of Washington, Seattle, WA [May'19-August'19]
Supervisor: Prof. Arvind Krishnamurthy
Hardware Acceleration of Proxies
The aim was to decrease the latency incurred at the level of proxies running as host-based processes
- Worked on Layer 4 and Layer 7 load balancing of different proxies like Envoy, Nginx & HAProxy to demarcate functionalities for host and SmartNIC offloading
- Performed benchmarking experiments using wrk2 to determine feasibility of SSL certificate verification offloading and scalability, with a detailed study of Envoy worker threads
|
|
Adobe Systems (Research), Bengaluru [May'18-July'18]
Supervisor: Dr. Balaji Srinivasan, Sr Research Scientist, BigData Experience Labs
Characteristics-Tailored Summary Generation
Unlike typical abstractive text summarisation, the aim was to tune summaries to characteristics like being more formal as required by news agencies or focus on certain financial aspects as desired by corporate organisations
- Adapted Facebook AI Research's Convolutional seq2seq model for translation to topic-tuned summary generation with modified attention weights to focus on specific input embeddings
- Altered beam search paradigm for tweaking decoder state probability distributions, thus enhancing word-level features like descriptiveness with token-based learning for length based summarisation
- Incorporated a Reinforcement Learning term in loss function and achieved a 6.4% increase in ROUGE scores
- Implemented the above insights on pointer-generator framework and submitted a patent application (P8322-US) for the same at United States Patent and Trademark Office
- Received a Full Time Offer in recognition of the above work
|
|
Philips Innovation Campus, Bengaluru [May'17-July'17]
Supervisor: Dr. Rajendra Sisodia, Principal Scientist, Philips Research
Domain-specific Customer Care Chatbot
Modern chatbots perform well in conversations comprising simple question-answer pairs. The aim was to develop a semantic control algorithm to track context switches to predict favourable next steps in the conversation
- Designed a chatbot leveraging word2vec, Latent Semantic Indexing and Latent Dirichlet Allocation for topics relevant to user query with tf-idf weighted word n-grams for improving accuracy
- Incorporated probabilistic finite automata to model conversation state changes guided by sentiment scores
- Built an emotion classifier SVM, and ontologies for knowledge representation from RDF sources with SPARQL queries for fetching data
Brief Report
|
Publications
|
|
Generating summaries tailored to target characteristics
20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019)
Recently, research efforts have gained pace to cater to varied user preferences while generating text summaries. While there have been attempts to incorporate a few handpicked characteristics such as length or entities, a holistic view around these preferences is missing and crucial insights on why certain characteristics should be incorporated in a specific manner are absent. With this objective, we provide a categorization around these characteristics relevant to the task of text summarization: one, focusing on what content needs to be generated and second, focusing on the stylistic aspects of the output summaries. We use our insights to provide guidelines on appropriate methods to incorporate various classes of characteristics in a sequence-to-sequence summarization framework. Our experiments with incorporating topics, readability and simplicity indicate the viability of the proposed prescriptions.
pdf version
|
Research Projects
|
|
Learned strategies for Key-Value Stores
Supervisor: Prof. Remzi & Andrea Arpaci-Dusseau, University of Wisconsin-Madison
Log-Structured Merge (LSM) based key-value stores have become so popular today that they are used as the backend for NewSQL database abstractions like TiDB. They use indexes for faster data lookup, whose memory overhead increases with database size, leaving less memory for caching data blocks. We employ learned techniques to reduce this space overhead without compromising read latencies. We also explore learning approaches to prioritize compactions and devise new compaction policies
- Block-based storage media like SSDs read at the granularity of data blocks. Traditional indexes store the last key of each such block and perform binary search on these entries to fetch the corresponding data block for a lookup
- Employed Learned Indexes to train a model to learn offsets from last keys of data blocks inside SSTables when created during compactions to reduce lookup time from O(log n) to O(1)
- Obtained a 53% reduction in indexing memory footprint over traditional indexes with <5% increase in point lookup latency using both Fuzzy and Greedy Piecewise Linear Regression in RocksDB
- Range query latencies show 11.4% reduction by picking SSTables with most user reads at a level for compaction thus minimizing seeks per read over default policy
Learned Compactions Slides | Learned Indexes Slides | Report
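A minimal sketch of the learned-index idea described above, written for illustration only and not taken from the RocksDB work: it fits a greedy (shrinking-cone) piecewise linear model that maps the last key of each data block to that block's position, with a bounded error `eps`. The class name `LearnedBlockIndex`, the helper `fit_segments` and the integer keys are assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def fit_segments(keys, eps):
    """Greedy shrinking-cone fit of key -> position with error <= eps.

    keys must be non-empty and strictly increasing. Returns a list of
    (start_key, start_pos, slope) segments.
    """
    segments = []
    x0, y0 = keys[0], 0
    lo, hi = float("-inf"), float("inf")  # admissible slope range
    for pos in range(1, len(keys)):
        dx = keys[pos] - x0
        new_lo = max(lo, (pos - eps - y0) / dx)
        new_hi = min(hi, (pos + eps - y0) / dx)
        if new_lo > new_hi:
            # No single slope fits this point too: close the segment.
            segments.append((x0, y0, (lo + hi) / 2))
            x0, y0 = keys[pos], pos
            lo, hi = float("-inf"), float("inf")
        else:
            lo, hi = new_lo, new_hi
    slope = 0.0 if lo == float("-inf") else (lo + hi) / 2
    segments.append((x0, y0, slope))
    return segments


class LearnedBlockIndex:
    """Hypothetical index over the last key of each data block."""

    def __init__(self, last_keys, eps=2):
        self.keys = list(last_keys)
        self.eps = eps
        self.segments = fit_segments(self.keys, eps)
        self.starts = [seg[0] for seg in self.segments]

    def predict(self, key):
        s = max(bisect_right(self.starts, key) - 1, 0)
        x0, y0, slope = self.segments[s]
        return y0 + slope * (key - x0)

    def find_block(self, key):
        """Index of the first block whose last key is >= key."""
        n = len(self.keys)
        pred = round(self.predict(key))
        lo = min(max(pred - self.eps - 1, 0), n)
        hi = min(max(pred + self.eps + 2, lo), n)
        i = bisect_left(self.keys, key, lo, hi)  # search a small window only
        ok = i < n and self.keys[i] >= key and (i == 0 or self.keys[i - 1] < key)
        # Fall back to a full binary search if the window missed.
        return i if ok else bisect_left(self.keys, key)
```

The memory saving comes from storing a few `(start_key, start_pos, slope)` triples instead of one entry per block; the fallback search keeps lookups correct even when a prediction lands outside the error window.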
|
|
Dataplane-Only Policy-Compliant Routing under Failures
Supervisor: Prof. Aditya Akella, University of Wisconsin-Madison
On failures, routers typically have inconsistent state, which leads to high convergence times. In such cases, the central software controller can become a bottleneck and finding policy-compliant paths is hard. We propose computing such paths in the data plane, with a central policy plane across end-host interfaces
- Performed several experiments on search algorithms used to compute routes in the data plane using P4 stacks and recirculation, with software emulation on Mininet for the RocketFuel set of topologies
- Analysed performance relative to other latency-aware routing algorithms to establish stretch as a function of number of links
- Provided support for Weighted Cost Multipath load balancing with dynamic weights on a per-packet basis
- Added per-session, per-flow and per-packet consistency using register-caching of policies to avoid excessive recirculations
Slide Deck 1 | Slide Deck 2 | D2R preprint
|
|
Adaptive SmartNIC-accelerated micro-load balancing
Supervisor: Prof. Aditya Akella, University of Wisconsin-Madison
Typical traffic trace dumps from network simulators, or mathematical simulations of network topologies with variation in queuing, sending rates, etc can aid in online learning of weights in an effort to load-balance traffic across several outgoing links. We explored such approaches utilizing SmartNIC / P4 switch computations
- Explored the use of RL-based actor-critic algorithms in designing traffic matrices to perform in-network load balancing to ensure max-min fairness and minimal queuing for Clos data-center networks
- Implemented a compressed neural network version of the same on P4 switches with weighted multi-path load balancing
- Harnessed SmartNIC-compute power to perform computation of weights at end-hosts to maintain line-rate processing at switches
Slide Deck for P4 version
|
|
State Replication and Fault Tolerance in P4
Supervisor: Prof. Mythili Vutukuru, Indian Institute of Technology Bombay
The project aims at replicating locally stored states in the primary switch to the secondary switch in real-time to avoid loss of state information in case of failures. Locally stored states aid in packet processing at line rate
- Constructed a synchronous cum asynchronous write-consistent bmv2 model to store "hard" network states (which can't be recovered from flow statistics) on the switch with consistent migration across backup switches in the data plane without control plane intervention
- Achieved faster flow switchover compared to root controller-mediated state updates (where the controller stores and syncs such states across all switches)
- Proposed an annotation-based API for a generalized fault-tolerant primitive to be incorporated in p4c
Report | Presentation
|
|
Imaging Techniques with Raman Spectroscopic Imaging
Supervisor: Prof. Ajit Rajwade, Indian Institute of Technology Bombay
Typical Raman spectroscopy takes a very long acquisition time and is used for diagnosing critical diseases like cancer. The aim of this project is to reduce the acquisition time without compromising on quality
- Learned a compact representation of paraffin subspace for spectral separation of biopsy sample using Nonnegative Sparse Coding, employing Blind Dictionary Learning with PCA for signal and noise separation
- Performed inpainting to enable compressed sensing of Raman spectral images, to speed up image acquisition
- Extended the same to the super-resolution use case with significant improvements over simple bicubic interpolation
- Achieved better results with Gaussian Mixture Models trained on a smaller representative set using the Expectation-Maximization (EM) algorithm
Report | Presentation | ICIP 2019 Paper Draft
|
|
Benchmarking of Software Switches
Supervisor: Prof. Mythili Vutukuru, Indian Institute of Technology Bombay
VPP and Open vSwitch are currently the fastest DPDK-based software switches out there. The aim was to determine the minimal resources required for optimal performance of a switch for different use cases
- Tested latency, throughput, efficiency in terms of cycles per packet with increasing cores, routing table entries and hierarchical cache sizes using uniform and skewed Gaussian traffic loads of 10 Gbps generated with DPDK-based packet generator MoonGen
- Analyzed VPP's batch packet processing paradigm and tested batch size as a function of different parameters
- Studied Cisco Express Forwarding implemented using multiway prefix trees in VPP, patented by Cisco
Report | Presentation
|
|
Optimizing Performance of Model-Counting Algorithms
Supervisor: Prof. Kuldeep Meel, National University of Singapore
The task involved a study of different model-counting algorithms, which count the solutions to a Boolean formula. The aim was to identify performance bottlenecks in the implementation for optimization
- Studied the SPARSE-COUNT algorithm and extended the same using GMP & MPFR libraries to support an arbitrarily large number of variables and multi-precision computations
- Implemented the above in ApproxMC, a similar model-counting framework, using Θ(log n) low-density parity constraints with tolerance guarantees for results within a specified confidence interval
- Results were validated using the IJCAI'16-CMV benchmarks
Report
|
Course Projects
|
|
Data-driven optimizations for Log-structured Merge Trees
Supervisor: Prof. Xiangyao Yu & Paris Koutris, University of Wisconsin-Madison
Modern database systems leverage key-value stores based on Log-Structured Merge (LSM) trees for storing metadata requiring fast lookups and updates. However, recent studies show that application throughput can be compromised by internal LSM tree operations that periodically write data to disk. We improved the existing background work scheduler in Google's LevelDB for better application performance. Runtime parameters like memtable and SSTable sizes, and the triggers for compaction and write stalls, were auto-tuned using a Bayesian optimizer, MLOS, to adapt to various workloads. Foreground writes were decoupled from background memtable flushes and compactions, and these background operations were scheduled during idle periods or when there were very few or no writes to the database. This showed a 2.24-2.34x improvement in observed client write throughput for bursty industrial write workloads
Project Report | Slides
|
|
Centralized vs Decentralized Stochastic Optimization Algorithms
Supervisor: Prof. Shivaram Venkataraman, University of Wisconsin-Madison
Federated learning entails training statistical models directly over numerous remote devices using local data leveraging their storage and computation capabilities. Security and data privacy concerns have pushed computation to the edge in contrast to classic ML training over centralized servers within datacenters. Centralized approaches like Parameter Server, Elastic Averaging SGD are compared with variants of Decentralized-PSGD. Biased and unbiased gradient compression operators like top-K, random-K, quantization via ECD-PSGD, DCD-PSGD and ChocoSGD are explored, with communication-computation overlap via Asynchronous D-PSGD to reduce idle time using bounded stale gradients. Training statistical efficiency is observed over time for bits transmitted across topologies like torus, ring with Stochastic Gradient Push (SGP) for directed graphs. Decentralized SGD algorithms with compressive communication are on par in convergence guarantees with their centralized counterparts
Project Report | Slides | Github repo
|
|
Count-Min Sketches for Network Traffic Scheduling
Supervisor: Prof. Aditya Akella, University of Wisconsin-Madison
Recent active queue management algorithms like RED, ECN and CoDel probe queue length to throttle the sending rate across all senders. However, they do not aim to identify the contributing flows that are the root cause of queue build-up. Here, I explored Count-Min Sketches, which overcome the scalability issues of per-flow counters and the dynamic allocation issues of hashmaps, to accurately record per-flow queue occupancy. Traffic is divided into snapshots comprising a fixed number of packets (alternately, of specified time intervals); each snapshot uses a count-min sketch based on register arrays in P4 to track flows in that interval, and the sketches are reused after a certain total packet count. I tested my implementation on Mininet with ECN-based feedback notifications to senders, using Flow Completion Time as a metric to demonstrate the effectiveness of this approach. I also used a C++ simulator with PcapPlusPlus on the UW Data Center Measurement Trace to evaluate Precision and Recall of the "contributing flow" classifier
Project Report | Github repo
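For readers unfamiliar with the data structure, here is a small host-side sketch of a count-min sketch in Python. It is illustrative only: the project itself used P4 register arrays, and the class name, the SHA-256-based row hashes and the default sizes below are assumptions for this example.

```python
import hashlib


class CountMinSketch:
    """Approximate per-flow counters in depth x width cells."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, flow_id):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.sha256(f"{row}:{flow_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, flow_id, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, flow_id)] += count

    def estimate(self, flow_id):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(self.rows[row][self._index(row, flow_id)]
                   for row in range(self.depth))

    def reset(self):
        # Reuse the same cells for the next snapshot.
        for row in self.rows:
            for i in range(self.width):
                row[i] = 0
```

An estimate never undercounts a flow; it can only overcount when flows collide in every row, which is why widening the rows or adding rows tightens the error.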
|
|
Accelerating Image Segmentation with Parallel Computing
Supervisor: Prof. Dan Negrut, University of Wisconsin-Madison
Image segmentation is used in medical imaging for tumour detection, and edge detection is used for tracing blood capillaries and for detecting roadside kerbs in autonomous driving. Here, we accelerate these algorithms using CUDA on GPUs and hybrid OpenMP+MPI approaches on multicore CPUs. We demonstrate optimizations like SIMD, loop unrolling, use of templates and forced inlining, along with use of shared and unified memory, while exposing the need for constructs like atomic/critical sections and thread synchronization/barriers through the implementation of Sobel & Canny edge-detectors and the Fuzzy C-Means algorithm for segmentation. Our work focuses on using CUDA streams, dynamic parallelism and the Thrust library along with OMP tasks to achieve high speedup
Project Report | Github repo
|
|
Tetrisbot
Supervisor: Prof. Zick Yair, National University of Singapore
We designed a utility-based agent trained with genetic algorithms, using a set of 10 state-dependent features like number of holes, height differences between adjacent columns, max height of a column, etc. We used the single-point crossover heuristic and implemented a multithreaded training approach that evaluates random independent block sequences in parallel. Particle swarm optimization was also employed to add an exploratory component and improve convergence of the weights. We achieved a maximum of over 856,000 cleared rows. Additionally, we implemented an auto-encoder approach with Q-learning for a low-dimensional game state representation. Though it was not quite successful with Tetris, we demonstrated its effectiveness on a simple game, "Catch the Ball"
Report | Github repo | A study of genetic algorithms
|
|
Legal Case Retrieval System
Supervisor: Prof. Zhao Jin, National University of Singapore
We designed a freetext search engine supporting both phrasal and boolean queries, leveraging NLTK to retrieve and rank legal case judgments. We finished 2nd out of 33 teams on the leaderboard based on the assignment given by the Singapore-based legal intelligence firm, Intellex. Positional indices were implemented to aid proximity search, with additional zone and field indices like court hierarchy and legal case dates to aid in retrieval. We were able to get a high F1 score using various query expansion techniques like pseudo relevance feedback using the Rocchio algorithm, WordNet synonyms and a co-occurrence thesaurus generated from the corpus dictionary. We used the lnc model of tf-idf for freetext search
Github repo
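As a rough illustration of the pseudo relevance feedback step mentioned above, the sketch below applies the Rocchio update without a negative term: the query vector is moved toward the centroid of the top-ranked documents. The function name, the sparse-dict vector format and the `alpha`/`beta` defaults are assumptions for this example, not the settings used in the project.

```python
from collections import Counter


def rocchio_expand(query_vec, top_docs, alpha=1.0, beta=0.75, max_terms=10):
    """Return alpha * query + beta * centroid(top_docs) as a sparse dict.

    query_vec and each entry of top_docs map term -> weight (e.g. tf-idf).
    Only the max_terms heaviest terms are kept in the expanded query.
    """
    expanded = Counter({term: alpha * w for term, w in query_vec.items()})
    for doc in top_docs:
        for term, weight in doc.items():
            expanded[term] += beta * weight / len(top_docs)
    return dict(expanded.most_common(max_terms))
```

Terms that occur often in the top-ranked documents but not in the original query enter the expanded query with a smaller weight, which is what lets the second retrieval pass pick up related judgments.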
|
|
Stereo Image Reconstruction using Energy Minimization
Supervisor: Prof. Cheong Loong Fah & Prof. Feng Jiashi, National University of Singapore
I implemented normalized graphcuts with α-expansion for image segmentation and denoising using multilabel 8-connected Markov Random Fields, and compared the same with mean-shift algorithm. I employed the PatchMatch algorithm to establish patch correspondences, for better alignment for homography. The other component involved obtaining dense correspondences from two images belonging to different viewpoints using manual methods and KLT tracker, to estimate the Fundamental Matrix using the 8-point algorithm
Component 1 Report | Component 2 Report
|
|
Generation of Nintendo Entertainment System Game layouts
Supervisor: Prof. Ganesh Ramakrishnan, Indian Institute of Technology Bombay
We built a Deep Convolutional GAN model in PyTorch for generating new game levels, i.e. tile sheets, from previous game layouts. We used Leaky ReLU as the activation function for both the discriminator and generator, with the Adam optimizer for stochastic gradient descent
Brief Presentation
|
|
A Detailed Study and Comparison of General-Purpose Fuzzers
Supervisor: Prof. Barton Miller, University of Wisconsin-Madison
We compared general-purpose mutation-based grey-box fuzzers like libFuzzer, American Fuzzy Lop (AFL) and honggfuzz, and evaluated their performance on the Google fuzzer-test-suite across 24 applications on metrics like code coverage (basic blocks and edges) and bug-finding capabilities. We found a new unreported bug in pcre2-10.0, with the key finding that only libFuzzer can find memory leaks with the help of LeakSanitizer. We also proposed a new framework for ensemble fuzzing which uses different base fuzzers in tandem
Project Report | Poster
|
|
Figaro : A Probabilistic Programming Language
Supervisor: Prof. Razvan Voicu & Prof. Chin Wei Ngan, National University of Singapore
Explored the Probabilistic Programming Monad in Figaro, which combines the object-oriented paradigm with the functional programming paradigm in Scala. Modeled real-life problems using Bayesian Networks with inference algorithms like Variable Elimination, Belief Propagation and Dynamic Reasoning algorithms like Factored Frontier. Simulated a simple market model using Decision Models to calculate the optimal policy. Extended the language by implementing a new Element class to model the distribution of the maximum value of a random variable, sampled from 0 to a given upper bound
Report | Presentation | Github repo
|
|
Image Quilting for Texture Synthesis and Transfer
Supervisor: Prof. Suyash Awate & Prof. Ajit Rajwade, Indian Institute of Technology Bombay
We employed the Efros & Leung algorithm to synthesize larger textures, and used the same algorithm with a modified cost function for iterative texture transfer to target images using correspondence maps. We also implemented the minimal error boundary cut using dynamic programming to avoid block-seam artifacts
Report
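The minimal error boundary cut mentioned above can be sketched as a small dynamic program. This is an illustrative version under stated assumptions: `error` is a hypothetical rows x cols grid of per-pixel overlap errors between two neighbouring blocks, and the function name is made up for this example.

```python
def min_error_vertical_cut(error):
    """Find the cheapest top-to-bottom seam through an overlap error grid.

    error is a non-empty list of equal-length rows of non-negative numbers.
    Returns (total_cost, path) where path[r] is the seam column in row r and
    consecutive columns differ by at most one.
    """
    rows, cols = len(error), len(error[0])
    cost = [row[:] for row in error]
    # Forward pass: cheapest way to reach each cell from the row above.
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backward pass: walk up from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    total = cost[-1][c]
    path = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        path.append(c)
    path.reverse()
    return total, path
```

Pixels to the left of the seam would be taken from one block and pixels to the right from the other, so the join follows the path where the two blocks already agree most.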
|
Relevant Coursework
|
- Systems: Adv & Intro to Operating Systems, Adv & Intro to Computer Networks, Adv & Intro to Databases, Big Data Systems, Machine Learning-Optimized Systems, High Performance Computing, Foundations of Data Management, Intro to Information Security, Computer Architecture, Implementation of Programming Languages
- AI/ML: Computer Vision, Advanced & Digital Image Processing, Foundations of Machine Learning, Machine Learning (Graduate Level), Artificial Intelligence, Information Retrieval, Optimization
- Statistics & Maths: Regression Analysis, Statistical Inference, Probability Theory, Derivative Pricing, Numerical Analysis, Data Analysis and Interpretation, Linear Algebra, Differential Equations, Calculus
- Theory: Cryptography, Automata Theory, Logic in Computer Science, Discrete Structures, Design and Analysis of Algorithms, Data Structures & Algorithms, Abstractions and Paradigms for Programming
- Others: Computer Graphics, Software Systems, Digital Logic Design, Computer Programming and Utilization, Electrical and Electronic Circuits, Quantum Physics and its Applications, Economics
|
Other Internships
|
|
Focus Analytics, Mumbai [Nov'17-Dec'17]
Supervisor: Sudin Kadam, Head of Research
Contextual Marketing for Retail Analytics
- Leveraged topic-modeling and word2vec similarity scores for customer segmentation and retail-affinity estimation using gensim and spaCy
- Implemented a probabilistic graphical model based recommendation engine, contributing to the pgmpy GitHub repo
- Created a new query language with pyparsing for an internal database system on Neo4j, utilizing EBNF grammar rules
Brief Report
|
|
OliveSync, Zone Startups India, Mumbai [Dec'16]
Supervisor: Ketan Ghatode, CTO, OliveSync Pvt. Ltd.
Automated Timetable Generation
- Designed a scheduling algorithm leveraging genetic algorithms to generate the best-fit timetable for institutions
- Added live sync to a MySQL database with PHP to track occurrence of classes and course adjustments
- Employed the Gale-Shapley algorithm for allotting time slot priorities to students and professors
Brief Report
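For reference, a compact sketch of the Gale-Shapley procedure named above. It assumes equal numbers of proposers and acceptors with complete preference lists; how students, professors and time slots were actually mapped onto the two sides is not stated here, so the names in the usage note are purely hypothetical.

```python
def gale_shapley(proposer_prefs, acceptor_prefs):
    """Return a stable matching as a dict proposer -> acceptor.

    Both arguments map a name to a full preference list (best first)
    over the other side, and both sides have the same size.
    """
    # rank[a][p]: how acceptor a ranks proposer p (lower is better).
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}  # acceptor -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p          # a trades up
            free.append(current)
        else:
            free.append(p)          # a rejects p
    return {p: a for a, p in engaged.items()}
```

For example, with people `ann`, `bob`, `cat` proposing to slots `mon`, `tue`, `wed`, the result is the matching that is best for the proposing side among all stable matchings, regardless of the order in which proposals are processed.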
|
Other Projects
|
A Java-like Compiler for OCaml
Supervisor: Prof. Razvan Voicu & Prof. Chin Wei Ngan, National University of Singapore
Designed an abstract syntax tree comprising unary and binary operations, conditionals, functions, recursive functions, applications and let constructs on various data types for the compiler, utilizing the Gram parser for parsing instructions. Implemented a virtual machine instruction interpreter with an operand stack for control flow. Performed type-checking using Hindley-Milner type inference with support for optional data types. The compiler was further optimized to leverage tail recursion and contiguous stack frames
|
OpenGL based 3D Animation Film
Supervisor: Prof. Cheng Ho-lun Alan, National University of Singapore
Dynamic rendering techniques were used to create this animation film, based on OpenGL's various timer functions. Different camera transformations were used like dolly zoom to add artistic effects. I used motion simulation along Bezier curves, adding soft shadows and transparency effects using Ray tracing. For object modeling, Phong illumination and Phong shading were used with texture mapping and bump mapping to mimic real-life surfaces
|
PokeDB : A Pokemon RPG Game
Supervisor: Prof. S Sudarshan, Indian Institute of Technology, Bombay
We built a multiplayer Pokemon game on PostgreSQL backend with JDBC API from pokeAPI JSON data with over 14,000 tuples. Online gym battles, navigable maps with probability models for capturing wild pokemon and evolution of pokemon with battle experience were also added
Report
|
Feed'er : An All-purpose Academic App
Supervisor: Prof. Sharat Chandran, Indian Institute of Technology, Bombay
Developed an integrated Android and Django based web app for displaying submission deadlines, exam dates and other important reminders via push-notifications. Implemented automatic sync and signup with social logins, with security measures against XSS, CSRF etc
User Manual | Presentation
|
Ethernet-enabled ATM Controller
Supervisor: Prof. Supratik Chakraborty, Indian Institute of Technology, Bombay
Developed an Ethernet-enabled FPGA module in VHDL (using Xilinx ISE) that dispenses cash using a greedy algorithm, with the Tiny Encryption Algorithm to secure the exchange of user data. Signalled insufficient balance and incorrect PIN on LED displays, and added frontend caching to protect against server crashes
|
LDAP Authenticated Chat Application
Supervisor: Prof. Varsha Apte, Indian Institute of Technology, Bombay
A server-client model with an X11-based GUI was developed using socket programming, with LDAP authentication using OpenLDAP. Group chats, an offline inbox via hashmaps and multimedia message exchanges were also supported
Slides
|
Movie Recommendation Engine
Supervisor: Prof. Sharat Chandran, Indian Institute of Technology, Bombay
Designed a Python program for computing the correlation between user and critic ratings based on Euclidean distances. Using the critic ratings, generated a list of recommended movies for the user, sorted by ratings weighted by the Pearson correlation coefficient between the user's and each critic's ratings
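A minimal sketch of the Pearson-weighted ranking described above, covering only the Pearson variant. The data layout (dicts of movie -> rating) and the function names are assumptions for this example, and critics with non-positive correlation are simply ignored.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two rating dicts over their common movies."""
    common = [m for m in a if m in b]
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[m] for m in common) / n
    mean_b = sum(b[m] for m in common) / n
    cov = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    var_a = sum((a[m] - mean_a) ** 2 for m in common)
    var_b = sum((b[m] - mean_b) ** 2 for m in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def recommend(user, critics):
    """Rank movies the user has not rated by similarity-weighted critic scores."""
    totals, weights = {}, {}
    for ratings in critics.values():
        sim = pearson(user, ratings)
        if sim <= 0:
            continue  # ignore critics whose taste does not match
        for movie, score in ratings.items():
            if movie not in user:
                totals[movie] = totals.get(movie, 0.0) + sim * score
                weights[movie] = weights.get(movie, 0.0) + sim
    ranked = [(totals[m] / weights[m], m) for m in totals]
    return sorted(ranked, reverse=True)
```

Dividing by the summed similarities keeps a movie reviewed by many critics from outranking a better movie reviewed by only a few.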
|
Simulation of Rube Goldberg Model
Supervisor: Prof. Sharat Chandran, Indian Institute of Technology, Bombay
Designed and simulated a Rube Goldberg Machine using Box2D, a physics simulation engine in C++, which involves compilation and linking to libraries like GLUI (GLUT based C++ user interface library). Designed a Star Wars arena by rendering attraction, repulsion among magnetic objects
|
Sudoku GamePlay Software
Supervisor: Prof. Amitabha Sanyal, Indian Institute of Technology, Bombay
Built a GUI based solver on MIT Scheme with features like Undo, Auto-solve, and filters for seeding games of varying difficulty levels. Employed backtracking algorithm to solve any given initial configuration
Report
|
Text Processor
Supervisor: Prof. Varsha Apte, Indian Institute of Technology, Bombay
Built a class for counting characters and words, with support for Find and Replace using the Knuth-Morris-Pratt algorithm, regular expressions, LZW compression, and encryption and decryption via Caesar cipher
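The Find and Replace feature can be illustrated with a short Knuth-Morris-Pratt sketch. This is a Python illustration with made-up function names, not the original class.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(text, pattern):
    """Return start indices of all (possibly overlapping) matches."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    matches, k = [], 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]  # reuse the already-matched prefix
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]
    return matches


def replace_all(text, pattern, replacement):
    """Replace non-overlapping matches from left to right."""
    out, last = [], 0
    for start in kmp_find_all(text, pattern):
        if start < last:
            continue  # overlaps the previous replacement
        out.append(text[last:start])
        out.append(replacement)
        last = start + len(pattern)
    out.append(text[last:])
    return "".join(out)
```

The failure table lets the search avoid re-reading text characters after a mismatch, so the whole scan runs in time linear in the combined length of text and pattern.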
|
Body Fat Estimation
Supervisor: Prof. Chan Yiu Man, National University of Singapore
Estimated body fat mass using stepwise regression, with statistical tests to check for multicollinearity, lack of fit, outliers and influential points identified using Cook's distance, DFFITS, DFBETAS and studentised residuals, implemented in R. The model was validated with a partial F-test for model significance, the Durbin-Watson test for independence of residuals and the Kolmogorov-Smirnov test for their normality
Report
|
A Study of Statistical Tests and Sampling Algorithms
Supervisor: Prof. Radhendushka Srivastava, Indian Institute of Technology, Bombay
Performed a critical study and simulation of the Random Excursions test and the Metropolis-Hastings sampling algorithm
Random Excursions Report | Metropolis-Hastings Algorithm Report
|
|