Projects
Geoping Measurement Pipeline
We deployed a global measurement pipeline that automated ping experiments across a wide array of regions,
specifically from 27 Amazon Web Services (AWS), 38 Google Cloud Platform (GCP), and 36 Azure regions, targeting all Internet router IP addresses.
Additionally, we developed Terraform scripts to efficiently set up cloud infrastructure and automate security
and compute resources on both GCP and Azure, improving our operational workflow and speed. |
Link
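For illustration, a minimal sketch of the per-region probing step, assuming a plain ICMP ping fan-out; the region tag, target file, worker count, and output path are placeholders rather than the pipeline's actual layout.

```python
# Minimal sketch: fan out ICMP pings to a target list from one cloud VM.
# Region name, target file, and output path are illustrative placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

REGION = "us-west-2"  # hypothetical region tag for this VM

def ping(target: str) -> str:
    # 4 probes, 2-second per-probe timeout; returns a raw result record.
    proc = subprocess.run(
        ["ping", "-c", "4", "-W", "2", target],
        capture_output=True, text=True,
    )
    return f"{REGION},{target},{proc.returncode},{proc.stdout!r}"

def main() -> None:
    with open("targets.txt") as fh:  # one router IP per line (placeholder file)
        targets = [line.strip() for line in fh if line.strip()]
    with ThreadPoolExecutor(max_workers=64) as pool:
        results = list(pool.map(ping, targets))
    with open(f"pings-{REGION}.csv", "w") as out:
        out.write("\n".join(results))

if __name__ == "__main__":
    main()
```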
Inferring Multi-lateral Peering
This project aims to address the AS-level topology incompleteness problem.
It involves mining the BGP communities used by Internet exchange point (IXP) route servers to implement multilateral peering.
We implemented an algorithm to mine these community values, extract the route server participants, and infer their export policies.
The pipeline was initially set up in 2015; we are currently focused on repairing it by modifying the parsing logic for looking glass (LG) servers,
updating the LG servers' URLs and querying mechanisms, and removing LG servers that are no longer in service. |
Link
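For illustration, a sketch of the policy-inference step under one common route-server community convention (0:peer to block, rs-asn:peer to announce, rs-asn:rs-asn to announce to all); the real encodings vary per IXP, which is what the mining step recovers.

```python
# Minimal sketch: infer a route-server participant's export policy from the
# BGP communities attached to its routes. The community conventions below
# are illustrative; each IXP documents its own encoding.
def export_policy(communities, rs_asn, participants):
    blocked, allowed = set(), set()
    for high, low in communities:          # a community is a (high, low) pair
        if high == 0 and low in participants:
            blocked.add(low)               # "do not export to AS low"
        elif high == rs_asn and low == rs_asn:
            allowed.update(participants)   # "export to everyone"
        elif high == rs_asn and low in participants:
            allowed.add(low)               # "export to AS low"
    return (allowed or set(participants)) - blocked

# Example: announce to all participants except AS64500 at a route server run by AS64512.
print(export_policy([(0, 64500), (64512, 64512)], 64512, {64496, 64500, 64501}))
```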
M-Lab BigQuery
This project aims to craft queries that gather speed test results and corresponding traceroutes from the Comcast and AT&T networks.
We computed statistics for a specific NDT server to determine which service provider's router attempted to connect to which service provider's client. |
Link
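For illustration, a sketch of one such query issued through the BigQuery Python client; the table and column names reflect M-Lab's unified NDT schema as best I can recall and should be checked against the current schema.

```python
# Minimal sketch: pull NDT download results for one ISP from M-Lab's public
# BigQuery dataset. Table/column names are assumptions about the unified schema.
from google.cloud import bigquery

QUERY = """
SELECT date, a.MeanThroughputMbps AS mbps, client.Network.ASName AS isp
FROM `measurement-lab.ndt.unified_downloads`
WHERE date BETWEEN DATE '2023-01-01' AND DATE '2023-01-31'
  AND client.Network.ASName LIKE '%Comcast%'
LIMIT 1000
"""

def main() -> None:
    client = bigquery.Client()            # uses default GCP credentials
    for row in client.query(QUERY).result():
        print(row.date, row.mbps, row.isp)

if __name__ == "__main__":
    main()
```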
Ookla Server Validation
This project aims to validate the geolocation of Ookla speed test servers by using the geoping measurement pipeline and by launching measurements on RIPE Atlas. |
Link
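For illustration, a minimal sketch of scheduling one such ping from RIPE Atlas probes with the ripe.atlas.cousteau library; the target address, probe selection, and API key are placeholders.

```python
# Minimal sketch: one-off ping measurement toward a candidate Ookla server IP
# from 20 worldwide RIPE Atlas probes. Target and key are placeholders.
from ripe.atlas.cousteau import Ping, AtlasSource, AtlasCreateRequest

ping = Ping(af=4, target="203.0.113.10", description="Ookla server geolocation check")
source = AtlasSource(type="area", value="WW", requested=20)

request = AtlasCreateRequest(
    key="YOUR_ATLAS_API_KEY",   # placeholder RIPE Atlas API key
    measurements=[ping],
    sources=[source],
    is_oneoff=True,
)
ok, response = request.create()
print(ok, response)
```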
Launching Measurements to Speed Test Servers from Ark Vantage Points
This project aims to launch measurements to Fast.com, an internet speed test service provided by Netflix, from all Ark vantage points in parallel.
It employs Scamper's Python module.
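For illustration, a sketch of the parallel fan-out using scamper's Python bindings; the ScamperCtrl calls follow the module's documented usage as best I can recall, and the control-socket directory and target address are placeholders.

```python
# Minimal sketch: ping one target from every attached Ark vantage point in
# parallel via scamper's Python module. Paths and addresses are placeholders.
from datetime import timedelta
from scamper import ScamperCtrl

ctrl = ScamperCtrl(remote_dir="/run/ark")     # one control socket per Ark VP (placeholder path)
for inst in ctrl.instances():
    ctrl.do_ping("203.0.113.10", inst=inst)   # placeholder speed test server address

for resp in ctrl.responses(timeout=timedelta(seconds=60)):
    print(resp.inst.name, resp.min_rtt)
```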
'SurfStore': Scalable and Fault-tolerant Dropbox
We engineered a cloud file syncing tool with Go and gRPC, alongside a scalable block storage system capable of handling 100 nodes through consistent hashing.
We also integrated the RAFT protocol to ensure fault tolerance and maintain metadata consistency. |
Link
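The system itself is written in Go; purely to illustrate the block-placement idea, here is a consistent-hashing ring sketched in Python with made-up node and block names.

```python
# Illustrative consistent-hashing ring: a block maps to the first node whose
# hash is clockwise of the block's hash, so adding or removing a node only
# remaps the blocks in one arc of the ring.
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self._points = sorted((_hash(n), n) for n in nodes)
        self._hashes = [h for h, _ in self._points]

    def node_for(self, block_id: str) -> str:
        i = bisect.bisect_right(self._hashes, _hash(block_id)) % len(self._points)
        return self._points[i][1]

ring = Ring([f"blockstore{i}" for i in range(100)])  # hypothetical 100-node cluster
print(ring.node_for("block-3f2a"))                   # placeholder block identifier
```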
Converting 'Wildfire Detection using SmokeyNet' to Serverless
We transformed our machine learning pipeline into a serverless architecture by deploying the training and inference processes on AWS Lambda and SageMaker, with data storage and management through EFS and S3.
This optimized pipeline manages data ingestion, image processing, and detection tasks efficiently, with costs kept below $0.00006 per invocation. |
Link
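For illustration, a sketch of the inference path as a Lambda handler calling a SageMaker endpoint via boto3; the endpoint name, event fields, and payload layout are placeholders, not the project's actual configuration.

```python
# Minimal sketch: AWS Lambda handler that forwards an image reference
# (e.g. staged on EFS/S3) to a SageMaker endpoint for smoke detection.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def handler(event, context):
    payload = json.dumps({"image_key": event["image_key"]})  # placeholder event field
    resp = runtime.invoke_endpoint(
        EndpointName="smokeynet-endpoint",                   # placeholder endpoint name
        ContentType="application/json",
        Body=payload,
    )
    result = json.loads(resp["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```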
Acoustic Species Identification
We developed a multi-species bird classifier that analyzes audio data to identify bird species within recordings. Leveraging the VGG model, we trained this classifier on strongly labeled data, incorporating diverse data augmentation techniques to refine its accuracy.
Our efforts culminated in a cmAP score of 0.65 in the BirdCLEF 2023 competition on Kaggle. |
Link
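For illustration, a sketch of the spectrogram-plus-VGG setup in PyTorch; the sample rate, clip length, species count, and classifier head are placeholders rather than the trained model's exact configuration.

```python
# Illustrative setup: log-mel spectrograms fed to a VGG backbone whose final
# layer is resized for multi-label bird species output.
import torch
import torchaudio
from torch import nn
from torchvision import models

NUM_SPECIES = 264  # placeholder class count
melspec = torchaudio.transforms.MelSpectrogram(sample_rate=32000, n_mels=128)
to_db = torchaudio.transforms.AmplitudeToDB()

backbone = models.vgg16(weights=None)
backbone.classifier[6] = nn.Linear(4096, NUM_SPECIES)   # multi-label head

def predict(waveform: torch.Tensor) -> torch.Tensor:
    # waveform: (1, samples); VGG expects 3-channel 2D input, so repeat channels.
    spec = to_db(melspec(waveform)).unsqueeze(0)         # (1, 1, mels, frames)
    spec = spec.repeat(1, 3, 1, 1)
    return torch.sigmoid(backbone(spec))                 # per-species probabilities

print(predict(torch.randn(1, 32000 * 5)).shape)          # 5-second dummy clip
```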
Publications