Biography
I am a Software Engineer at Meta (Integrity Org), working on Social Issues and Electoral Protocols. My focus is on ensuring content integrity, compliance, and transparency for paid content at massive scale.
Previously at Microsoft, I architected mission-critical data pipelines and operational databases. I specialize in Distributed Systems and building scalable infrastructure to reduce harm and ensure user safety.
Current Mission: Maintaining integrity and transparency in enforcement protocols across Meta's platforms.
Experience
Software Engineer
Meta · Content Integrity (Menlo Park, CA)
Leading LLM-based Ad Content Detection and Enforcement across LLAMA and Gemini model pipelines. Built the Political Ads investigation system handling ~140k ads/day. Designed a Political Content LLM with prompt engineering, token caching, and Google Search Grounding that outperforms human reviewers and saves millions in OpEx.
LLM Ad Content Detection and Enforcement
- Led a cross-functional team of Eng, DS, DE and Ops on the implementation of LLM detection and enforcement infrastructure from model onboarding through production enforcement, spanning both LLAMA and Gemini model pipelines.
- Built the Political Ads LLM investigation system that routes ads through proactive and reactive detection and enforcement paths. The system can handle a detection workload of around 140k ads per day.
- Designed and implemented a Gemini-based Political Content LLM, covering prompt engineering, multi-stage models, token caching to reduce inference cost, and Google Search Grounding for up-to-date information retrieval. The model achieved higher precision and recall than human reviewers, saving millions of dollars in OpEx on large-scale contracted human review.
SDE
Microsoft · CSCP (Redmond, WA)
Led the backend operational database for Azure's GPU Hub Deployment, migrating it from Dataverse/PowerApps to Azure SQL DB with zero downtime. Built Infrastructure as Code using ARM, Bicep, and Azure DevOps. Oversaw data syncing between SQL DB and Delta Lake via ADF and Change Data Capture.
Azure Data Center GPU Lifecycle Management Data Platform
- Managed and led a team of SDEs and vendors in the system design and development of backend operational database infrastructure supporting Azure's GPU Hub Deployment operations in supply chain, inventory management, data center build-out and life-cycle management. This accelerated deployment and enabled real-time status updates on GPU assets for Azure's AI computational capacity.
- Redesigned the database and backend system, migrating the existing solution from Dataverse and PowerApps to a SQL DB hosted on Azure. Implemented tighter security with RBAC, redeployed business logic as SQL objects such as Stored Procedures and Triggers, and added an API on the Data Service Layer and an automated data validation mechanism. The migration was completed with zero downtime.
- Led back-end infrastructure development for the next-generation data lake serving transactional and analytical workloads for Azure Data Center compute resource supply chain and demand forecasting. Led the team effort in developing Infrastructure as Code using ARM, Bicep, YAML template files, Azure DevOps and CI/CD, establishing a safe and compliant deployment workflow that significantly reduced resource provisioning failures.
- Oversaw the data syncing and migration mechanism between SQL Database and Delta Lake tables, using a combination of ADF Data Flow and the database's Change Data Capture feature.
- Designed and implemented an event-driven cross-region data replication mechanism; introduced an enhanced Data Governance Model and Data Catalog capabilities to the data lake using Azure Purview.
- Team Champion for data security and compliance with company privacy and security policies; team lead for development of the Business Critical Disaster Recovery Plan.
SDE
Microsoft · OPG (Vancouver, BC)
Led a Time Series Anomaly Detection System processing terabytes of daily streaming data with 100K peak-hour jobs on Azure Container Services and CosmosDB. Pioneered a Data Governance, Catalog and Lineage System. Built mission-critical M365 enterprise data pipelines achieving 5–10x efficiency gains.
M365 Data Platform — Time Series Anomaly Detection System
- Led and managed a cross-team effort of software engineers, data engineers and data scientists in developing a time series anomaly detection system featuring accurate and timely detection of spikes and dips on terabytes of daily streaming data, easy onboarding of new data assets from multiple sources, extensible ML models, and an automated notification system.
- Implemented a scalable and resilient backend job service using Microservice Architecture and CosmosDB, deployed and orchestrated using Azure Container Services and Azure Storage Queue. The service handles 100K jobs during peak hours and scales to zero during low traffic to save cost.
- Designed and developed a unified Machine Learning platform on Azure Machine Learning, Azure Synapse and Spark, enabling data scientists from various teams to run anomaly detection models across all onboarded data assets.
- Designed and developed a real-time, event-driven notification system using Azure Storage Queue, containerized Docker applications and Azure Communication Services.
M365 Data Platform — Data Governance, Catalog and Lineage System
- Led and pioneered the development of a Data Governance, Catalog and Lineage System capable of data dependency tracking, data cataloging and graph network visualization. It became the go-to place for understanding data dependencies and for upstream/downstream analysis.
- Developed a Microservice Architecture using Azure Functions, Docker and Azure Container Apps, with Azure Storage Queue as the message broker between services, to run and scale log analytics for Azure Data Factory and Azure Data Lake Analytics jobs.
- Used CosmosDB/MongoDB to store data dependency and catalog information; used Redis Cache to cache recent queries and deduplicate repeated jobs.
- Implemented both active and passive system health monitoring for every backend service component, using Kusto Data Explorer for job status tracking.
Office M365 Enterprise Client Data Pipeline and Data Analytics
- Led the development of several mission-critical data pipelines processing terabytes of enterprise user information for the M365 Office Suite, using Data Factory, Synapse, AzureML, Data Lake Analytics and Data Explorer. Analytics scripts were written in U-SQL and PySpark; insights were used by leadership for commercial campaigns and tenant health monitoring.
- Optimized legacy pipelines through skew reduction, redundancy reduction and maximized parallel compute, achieving a 5–10x improvement in data processing efficiency and significantly reducing on-call effort and pipeline failure rate.
SWE
Altera / Intel PSG (Toronto, ON)
Led development of the Intel Quartus FPGA visualization heatmap in C++/Qt, released as a key feature in Quartus 19.1 two years ahead of the competition. Optimized data parsing for 50M+ entries, reducing reaction time from ~10 min to <1s.
Logic Placement and Routing Visualization Heatmap
- Led the development of the Intel Quartus software system for user data processing and visualization of FPGA resource allocation, built with C++ and Qt. It was released as a pioneering product in the industry and is relied upon by millions of customers for software design analysis and debugging.
- The final product was released and marketed as a key differentiating feature of Quartus 19.1, reaching the market two years before a competitor introduced a similar solution.
- Optimized the hierarchical tree-structured data parsing and filtering algorithm, reducing reaction time from ~10 min to <1s for 50M+ data entries. Implemented lazy loading techniques to speed up report loading.
- Set Modular Design as the standard for software components; all components were later reused by fellow engineers. Created the software interface module for two-way interaction between the backend and the frontend GUI of the Intel Agilex F-tile place and route software.
Technical Skills
Languages
Cloud Platform
Engineering
Education
University of Toronto
Master of Engineering, Computer Engineering
Jan 2016 - May 2017
Hong Kong University of Science and Technology
Bachelor of Engineering, Electronic & Computer Engineering
Sep 2011 - May 2015
Photography
Moments captured from my travels
Connect
Ready to collaborate on the next big thing?