Jing Xiong

Ph.D. in Electrical Engineering

Stanford University

Email: jing.xiong@cs.stanford.edu

Interests: Computer Vision, Natural Language Processing, Automatic Speech Recognition

2019 - 2021 Google Lens

2021 - 2023 Google Cloud AI

Prior to joining Google, I was a research assistant in the VLSI Research Group, applying computer vision to neuroscience problems with Professor Mark Horowitz. I worked closely with the Luo Lab and was co-advised by Professor Liqun Luo.

Education

Ignite - Certificate Program in Innovation and Entrepreneurship, Stanford School of Business, 2020

Ph.D. - Electrical Engineering, Stanford University, 2019

M.S. - Electrical Engineering, Stanford University, 2015

B.S. - Electrical Engineering, Mathematics Minor, Summa Cum Laude, High Distinction, University of Minnesota, Twin Cities, 2012

Research Project - A tool for mapping histological slices to the Allen Mouse Brain Atlas

Histological brain slices are widely used in neuroscience to study the anatomical organization of neural circuits. Systematic and accurate comparisons of anatomical data from multiple brains, especially from different studies, can benefit tremendously from registering histological slices onto a common reference atlas. Most existing methods rely on an initial reconstruction of the volume before registering it to a reference atlas. Because these slices are prone to distortions during the sectioning process and often sectioned at non-standard angles, reconstruction is challenging and often inaccurate. Here we describe a framework that maps each slice to its corresponding plane in the Allen Mouse Brain Atlas (2015) to build a plane-wise mapping and then performs 2D nonrigid registration to build a pixel-wise mapping. We use the L2 norm of the histogram of oriented gradients of two patches as the similarity metric for both steps, and a Markov random field formulation that incorporates tissue coherency to compute the nonrigid registration. To fix significantly distorted regions that are misshapen or much smaller than the control grids, we train a context-aggregation network to segment and warp them to their corresponding regions with thin plate splines. We have shown that our method generates results comparable to those of an expert neuroscientist and significantly better than those of reconstruction-first approaches.

[paper][website][code][slides]
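
As an illustration of the similarity metric described above - the L2 norm of the difference between the HOG descriptors of two patches - here is a minimal sketch using skimage. The patch size and HOG parameters are illustrative assumptions, not the values used in the project.

    import numpy as np
    from skimage.feature import hog

    def hog_l2_distance(patch_a, patch_b):
        # L2 norm of the difference between HOG descriptors; lower means more similar.
        # Patches must have the same shape. Parameters below are illustrative only.
        params = dict(orientations=8, pixels_per_cell=(16, 16), cells_per_block=(1, 1))
        return np.linalg.norm(hog(patch_a, **params) - hog(patch_b, **params))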

Research Project - Anatomical study of the dorsal raphe serotonin sub-systems

The dorsal raphe (DR) constitutes a major serotonergic input to the forebrain and modulates diverse functions and brain states, including mood, anxiety, and sensory and motor functions. Most functional studies to date have treated DR serotonin neurons as a single population. Using viral-genetic methods, we found that subcortical- and cortical-projecting serotonin neurons have distinct cell-body distributions within the DR and differentially co-express a vesicular glutamate transporter. Further, amygdala- and frontal-cortex-projecting DR serotonin neurons have largely complementary whole-brain collateralization patterns, receive biased inputs from presynaptic partners, and exhibit opposite responses to aversive stimuli. Gain- and loss-of-function experiments suggest that amygdala-projecting DR serotonin neurons promote anxiety-like behavior, whereas frontal-cortex-projecting neurons promote active coping in the face of challenge. These results provide compelling evidence that the DR serotonin system contains parallel sub-systems that differ in input and output connectivity, physiological response properties, and behavioral functions.

[paper]

Industry Project - Improved ranking of web results for image-based search using click data at Google Lens

Initiated, designed, implemented, and generated preliminary results for a feature that uses user click data to improve image search ranking at Google Lens. The feature combines historical user click data with image similarity to rank image search results more effectively. Entered live experiment in Q1 2022.

[Defensive publication]
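
The production system is not public; the sketch below only illustrates the general idea of blending a smoothed historical click-through rate with a visual similarity score. The function names, score definitions, and blending weight are made-up assumptions.

    def rank_results(results, clicks, impressions, alpha=0.7):
        # results: list of (result_id, visual_similarity in [0, 1]).
        # Blend visual similarity with a smoothed historical click-through rate.
        def score(result_id, similarity):
            ctr = (clicks.get(result_id, 0) + 1.0) / (impressions.get(result_id, 0) + 2.0)
            return alpha * similarity + (1.0 - alpha) * ctr
        return sorted(results, key=lambda r: score(*r), reverse=True)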

Industry Project - End-to-end "Top match" feature at Google Lens

Tech lead of the "Top match" (now "high confidence clusters") feature of Google Lens. The feature differentiates and highlights high confidence image results from all retrieved similar images results for user image queries. Directly and primarily responsible for developing the new end-to-end feature. Created and drove the feature road map. Designed and implemented the algorithm that determines high confidence image answer for image queries. Collaborated with 4 different Google teams and across 2 time zones to deliver the feature from scratch in < 3 months. Resulted in 3pp E2E quality improvement and powers about 350M queries per month. Initial launch in July 2020.

Industry Project - Unified image similarity model at Google Lens

Tech lead of the image similarity ranking model at Google Lens. Responsible for bringing together the visual intelligence of multiple server-side vision models to rank image results from all Lens verticals effectively. Aligned requirements from 4 Lens teams, unified the image similarity definition across all Lens verticals, and collected ground truth data for model training. Led engineers from 3 Google teams in training and deploying a unified image similarity scoring model in the Lens backend system with neutral latency, better result quality, and a new SW+AI architecture. This model scores and ranks retrieved images for all Lens traffic. Initial launch in June 2021.

Winner of a Silver Perfy Award at Google for capacity management in Q3 2021.

Industry Project - Trust & safety results for people-related queries

Invented and implemented an end-to-end computer vision cascade that enables trust & safety compliant results for queries containing people or faces, by showing safe results drawn only from non-people-sensitive regions of the query image. Launched June 2020.

[Defensive publication]
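
The launched cascade itself is not public; as a hypothetical illustration of the idea, one can filter candidate query regions to those that do not overlap detected people or faces. The detector, region proposer, and IoU function below are placeholder callables, not a real API.

    def non_people_regions(image, detect_people, propose_regions, iou, max_iou=0.1):
        # Keep only candidate regions that barely overlap any detected person/face box.
        # detect_people, propose_regions, and iou are placeholder callables.
        people_boxes = detect_people(image)
        return [region for region in propose_regions(image)
                if all(iou(region, box) < max_iou for box in people_boxes)]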

Industry Project - On-device suggested action on Google Photos

Point of contact for on-device visual intelligence for Lens on Photos. Responsible for bringing the visual intelligence of server-side vision models to mobile despite stringent compute and power constraints. Built and improved the core on-device computer vision cascade for the suggested-action feature: privacy-preserving, compact, and built entirely from distilled on-device models. Collaborated across 3 different organizations to bring the feature from scratch to live experiment in < 4 months. Conceived and implemented a new SW+AI architecture to enable E2E user privacy without sacrificing quality and scale. Nov 2019 - Mar 2020.

[Defensive publication 1]

[Defensive publication 2]
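
As background on the distillation mentioned above, here is a minimal, generic knowledge-distillation loss in PyTorch. The temperature and weighting are illustrative assumptions, not values used in the product.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        # Soft targets from the (server-side) teacher, softened by temperature T.
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                        F.softmax(teacher_logits / T, dim=-1),
                        reduction="batchmean") * (T * T)
        # Standard supervised loss on the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard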

Course Project - CNN-based segmentation on Nissl-stained histological images

We adopted the structure of the fully convolutional network (FCN) for this segmentation problem. We trained a model to segment an experimental histological image into main brain regions - grey matter (cerebrum, brainstem, and cerebellum), fiber tracts, and ventricular systems - and background, and achieved 96.1% accuracy on the test reference slices and 92.1% accuracy on the test experimental datasets. The network was trained mainly on reference images because of the limited availability of segmented experimental data.
[project report]
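
A minimal sketch of a fully convolutional segmentation network in PyTorch, to illustrate the approach; the architecture, channel widths, and single-channel input are assumptions, not the network used in the project.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyFCN(nn.Module):
        # Four classes: grey matter, fiber tracts, ventricular systems, background.
        def __init__(self, num_classes=4):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )
            # 1x1 convolution gives per-pixel class scores at reduced resolution.
            self.classifier = nn.Conv2d(64, num_classes, 1)

        def forward(self, x):
            logits = self.classifier(self.features(x))
            # Upsample back to the input resolution, as in FCN-style decoders.
            return F.interpolate(logits, size=x.shape[-2:], mode="bilinear",
                                 align_corners=False)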

Course Project - Hidden emotion detection through facial expression analysis

The Android app we developed takes video frames of a human face from the camera as input and outputs a fusion image of extracted facial features and contours together with a motion distribution map. The motion distribution map is generated from a micro-expression heat map with color added, where the brightness of the color is scaled by the magnitude of motion in each area of the face. The client, an Android device, obtains the initial locations of the eyes and mouth. Covariance-based image registration is used on the server side to generate the motion distribution of facial features. The resulting fusion image is then sent back to the client for display. From this fusion image, users can observe micro-scale changes in facial features and thus interpret human emotions. Since we extract more than just the key points of facial features, we expect that fully utilizing our data would give precise interpretation, provided a robust scoring system for the motions of different facial features and contours.
[report]
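
As a rough illustration of the motion distribution map, the sketch below computes per-pixel motion magnitude between consecutive frames and renders it as a color map whose brightness scales with motion. It uses OpenCV dense optical flow as a stand-in for the covariance-based registration actually used in the project.

    import cv2
    import numpy as np

    def motion_overlay(prev_gray, curr_gray):
        # Dense optical flow as a stand-in for covariance-based registration.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        magnitude = np.linalg.norm(flow, axis=2)
        # Scale motion magnitude to [0, 255] and map it to colors whose
        # brightness reflects how much each facial area moved.
        norm = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
        return cv2.applyColorMap(norm, cv2.COLORMAP_JET)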

Course Project - Recommender system utilizing users' listening history and social network information

We implemented a music recommender system based on users' listening history and social network. We used collaborative filtering with both user-based and item-based strategies. For user-based collaborative filtering, we measured users' similarity with both the binary information and actual play count in their listening history. Our methods significantly increased the accuracy of recommendation. Furthermore, we modified the user-based collaborative filtering algorithm and came up with a method that combined the users' listening history and social relationships for music recommendation.
[report]
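
A minimal sketch of the user-based collaborative filtering described above, using cosine similarity over play counts; the variable names and the choice of cosine similarity are illustrative assumptions.

    import numpy as np

    def recommend_user_based(play_counts, user, top_k=10):
        # play_counts: (n_users, n_items) float matrix of play counts; user: row index.
        norms = np.linalg.norm(play_counts, axis=1, keepdims=True) + 1e-9
        normalized = play_counts / norms
        similarities = normalized @ normalized[user]   # cosine similarity to each user
        similarities[user] = 0.0                       # ignore the user themselves
        scores = similarities @ play_counts            # similarity-weighted play counts
        scores[play_counts[user] > 0] = -np.inf        # drop already-played items
        return np.argsort(scores)[::-1][:top_k]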

Ph.D. Thesis

Talks

Stanford Imaging Symposium, 9/17/2018.

Stanford Center for Image Systems Engineering (SCIEN) Industry Affiliates Meeting, 2018.

Biomedical Computation at Stanford Symposium, 4/4/2016.

Stanford Bio-X IIP Symposium, 2/17/2016.

Center for Biomedical Imaging at Stanford Symposium, 4/29/2015.

Invited Talks

Neuroscience Conference 2018

CSIT 2021

AnalytiX2021

Awards and Honors

Friends of Music Applied Music Scholarship, Stanford University, 2015 - 2016

Stanford Graduate Fellowship, Stanford University, 2012 - 2015

Albert George Oswald Prize, University of Minnesota, 2011 - 2012

KSP & Kumar Scholarship, University of Minnesota, 2010

Miscellaneous

I love scuba diving. I was involved with the Guzheng Community at Stanford when I wasn't too busy with research, and have performed in the Stanford Chinese New Year Spring Gala at Memorial Auditorium and in the Stanford Guzheng Ensemble Concert.

I enjoy watching basketball games and am certified as a secondary basketball referee in China.