Vive Center Publications

2024

Bjoern Hartmann

Design Space Exploration for Board-level Circuits: Exploring Alternatives in Component-based Design
Richard Lin, Rohit Ramesh, Parth Nitin Pandhare, Kai Jun Tay, Prabal Dutta, Bjoern Hartmann, and Ankur Mehta
CHI 2024
In this work, we examine user-guided design space exploration as a middle ground between intuitive-but-ambiguous high-level representation of a circuit and a fully-specified, fabrication-ready circuit.
full paper

Rambler: Supporting Writing With Speech via LLM-Assisted Gist Manipulation
Susan Lin, Jeremy Warner, J.D. Zamfirescu-Pereira, Matthew G Lee, Sauhard Jain, Shanqing Cai, Piyawat Lertvittayakumjorn, Michael Xuelin Huang, Shumin Zhai, Bjoern Hartmann, Can Liu
CHI 2024
This paper presents Rambler, an LLM-powered graphical user interface that supports gist-level manipulation of dictated text with gist extraction and macro revision.
full paper

Prompting for Discovery: Flexible Sense-Making for AI Art-Making with Dreamsheets
Shm Garanganao Almeda, J.D. Zamfirescu-Pereira, Kyu Won Kim, Pradeep Mani Rathnam, Bjoern Hartmann
CHI 2024
How can interfaces support end-users in reliably steering prompt-space explorations towards interesting results? Our design probe, DreamSheets, supports user-composed exploration strategies with LLM-assisted prompt construction and large-scale simultaneous display of generated results, hosted in a spreadsheet interface.
full paper

Generating Automatic Feedback on UI Mockups with Large Language Models
Peitong Duan, Jeremy Warner, Yang Li, Bjoern Hartmann
CHI 2024
We explore the potential of using large language models for automatic feedback. Specifically, we focus on applying GPT-4 to automate heuristic evaluation, which currently entails a human expert assessing a UI’s compliance with a set of design guidelines.
full paper

Yi Ma

Open the Black Box of Transformers:
White-Box Transformers via Sparse Rate Reduction: Compression is All There is?
Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, and Yi Ma, arXiv:2311.13110, JMLR, 2024.

Representation Learning via Manifold Flattening and Reconstruction,
Michael Psenka, Druv Pai, Vishal Raman, Shankar Sastry, Yi Ma, arXiv:2305.01777, accepted by JMLR, February 2024.

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning,
Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Yifei Zhou, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, and Sergey Levine, NeurIPS 2024.

Scaling White-Box Transformers for Vision,
Jinrui Yang, Xianhang Li, Druv Pai, Yuyin Zhou, Yaodong Yu, Yi Ma, and Cihang Xie, NeurIPS 2024.

Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation,
Qingwen Bu, Jia Zeng, Li Chen, Yanchao Yang, Guyue Zhou, Junchi Yan, Ping Luo, Heming Cui, Yi Ma, and Hongyang Li, NeurIPS 2024.

Lessons from Learning to Spin Pen,
Jun Wang, Ying Yuan, Haichuan Che, Haozhi Qi, Yi Ma, Jitendra Malik, Xiaolong Wang, Conference on Robot Learning (CoRL), 2024.

A Global Geometric Analysis of Maximal Coding Rate Reduction,
Peng Wang, Huikang Liu, Druv Pai, Yaodong Yu, Zhihui Zhu, Qing Qu, and Yi Ma, ICML 2024.

Learning a Diffusion Model Policy from Rewards via Q-Score Matching,
Michael Psenka, Alejandro Escontrela, Pieter Abbeel, and Yi Ma, arXiv:2312.11752, ICML 2024.

Differentially Private Representation Learning via Image Captioning,
Tom Sander, Yaodong Yu, Maziar Sanjabi, Alain Oliviero Durmus, Yi Ma, Kamalika Chaudhuri, and Chuan Guo, arXiv:2403.02506, ICML 2024.

ViP: A Differentially Private Foundation Model for Computer Vision,
Yaodong Yu, Maziar Sanjabi, Yi Ma, Kamalika Chaudhuri, and Chuan Guo, arXiv:2306.08842, ICML 2024.

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs,
Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, and Saining Xie, arXiv:2401.06209, oral presentation, CVPR 2024.

Masked Completion via Structured Diffusion with White-Box Transformers,
Druv Pai, Sam Buchanan, Ziyang Wu, Yaodong Yu, and Yi Ma, arXiv:2404.02446, ICLR 2024.

RLIF: Interactive Imitation Learning as Reinforcement Learning,
Jianlan Luo, Perry Dong, Yuexiang Zhai, Yi Ma, and Sergey Levine, arXiv:2311.12996, ICLR 2024.

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models,
Tianzhe Chu, Shengbang Tong, Tianjiao Ding, Xili Dai, Benjamin David Haeffele, Rene Vidal, and Yi Ma, arXiv:2306.05272, ICLR 2024.

Investigating the Catastrophic Forgetting in Multimodal Large Language Model Fine-Tuning,
Yuexiang Zhai, Shengbang Tong, Xiao Li, Mu Cai, Qing Qu, Yong Jae Lee, Yi Ma, CPAL 2024.

Closed-Loop Transcription via Convolutional Sparse Coding,
Xili Dai, Ke Chen, Shengbang Tong, Jingyuan Zhang, Xingjian Gao, Mingyang Li, Druv Pai, Yuexiang Zhai, Xiaojun Yuan, Heung-Yeung Shum, Lionel Ni, Yi Ma, CPAL 2024.

Unsupervised Learning of Structured Representation via Closed-Loop Transcription,
Shengbang Tong, Xili Dai, Yubei Chen, Mingyang Li, Zengyi Li, Brent Yi, Yann LeCun, Yi Ma, CPAL 2024.

Emergence of Segmentation with Minimalistic White-Box Transformers,
Yaodong Yu, Tianzhe Chu, Shengbang Tong, Ziyang Wu, Druv Pai, Sam Buchanan, Yi Ma, arXiv:2308.16271, CPAL 2024.

Ren Ng

A. Kotani and R. Ng, “A Computational Framework for Modeling Emergence of Color Vision in the Human Brain,” arXiv preprint arXiv:2408.16916, Sep. 2024.

J. Lee, N. Jennings, V. Srivastava, and R. Ng, “Theory of Human Tetrachromatic Color Experience and Printing,” ACM Transactions on Graphics (TOG), vol. 43, no. 4, pp. 1–15, Aug. 2024.

James O’Brien

Effect of Duration and Delay on the Identifiability of VR Motion
Mark Miller, Vivek Nair, Eugy Han, Cyan DeVeaux, Christian Rack, Rui Wang, Brandon Huang, Marc Latoschik, James F. O’Brien, Jeremy N. Bailenson
SePAR 2024

Effect of Data Degradation on Motion Re-Identification
Vivek Nair, Mark Roman Miller, Rui Wang, Brandon Huang, Christian Rack, Marc Latoschik, James F. O’Brien
SePAR 2024

Truth in Motion: The Unprecedented Risks and Opportunities of Extended Reality Motion Data
Vivek Nair, Louis Rosenberg, James F. O’Brien, Dawn Song
IEEE S&P

Deep Motion Masking for Secure, Usable, and Scalable Real-Time Anonymization of Ecological Virtual Reality Motion Data
Vivek Nair, Wenbo Guo, James F. O’Brien, Louis Rosenberg, Dawn Song
IEEE VR3D

Inferring Private Personal Attributes of Virtual Reality Users from Ecologically Valid Head and Hand Motion Data
Vivek Nair, Christian Rack, Wenbo Guo, Rui Wang, Shuixian Li, Brandon Huang, Atticus Cull, James F. O’Brien, Marc Latoschik, Louis Rosenberg, Dawn Song
IEEE VR3D

Berkeley Open Extended Reality Recordings 2023 (BOXRR-23): 4.7 Million Motion Capture Recordings from 105,000 XR Users
Vivek Nair, Wenbo Guo, Rui Wang, James F. O’Brien, Louis Rosenberg, Dawn Song
IEEE VR 2024

Shankar Sastry

C.-Y. Chiu and S. Sastry. Parameter Estimation in Optimal Tolling for Traffic Networks Under the Markovian Traffic Equilibrium. (accepted at) 2024 American Control Conference (ACC), Toronto, Canada, 2024. [PDF]

Pan-Yang Su, Chinmay Maheshwari, Victoria Tuck, Shankar Sastry. Incentive-Compatible Vertiport Reservation in Advanced Air Mobility: An Auction-Based Approach. arXiv:2403.18166. March 2024. [PDF]

Kudva, Sukanya, Kshitij Kulkarni, Chinmay Maheshwari, Anil Aswani, and Shankar Sastry. “Understanding the Impact of Coalitions between EV Charging Stations.” arXiv preprint arXiv:2404.03919 (2024). [PDF]

Maheshwari, Chinmay, Kshitij Kulkarni, Druv Pai, Jiarui Yang, Manxi Wu, and Shankar Sastry. “Congestion Pricing for Efficiency and Equity: Theory and Applications to the San Francisco Bay Area.” arXiv preprint arXiv:2401.16844 (2024). [PDF]

2023

Luisa Caldas

Zhuang X., Ju Y., Yang, A. and Caldas L., 2023, Synthesis and Generation for 3D Architecture Volume with Generative Modelling. International Journal of Architectural Computing, Special volume, AI, Architecture, Accessibility, & Data Justice, April 2023. DOI: 10.1177/14780771231168233

Sathyanarayanan, H., Caldas, L., 2023, Co-Designing with Children: Innovating Patient Engagement and Participation in Pediatric Healthcare Design Research with Immersive Technology and Affective Interactions. Academy Journal No.24, August 2023, Academy of Architecture for Health, American Institute of Architects

Bailey, E., Caldas, L., 2023, Operative Generative Design Using Non-Dominated Sorting Genetic Algorithm (NSGA-II), 155(4), July 2023, Automation in Construction, Elsevier. DOI: 10.1016/j.autcon.2023.105026

Zani, A.; Speroni, A.; Mainini, A.; Zinzi, M.; Caldas, L.; Poli, T., 2023, Customized shading solutions for complex building façades: the potential of an innovative cement-textile composite material through a performance-based generative design, Construction Innovation: Information, Process, Management, Vol. 23, Emerald, DOI 10.1108/CI-01-2023-0014

Sathyanarayanan, H., Caldas, L., 2023, Patient-Centered Design in Pediatric Healthcare Buildings Using Immersive Virtual Environments, 7th Annual Virtual Reality and Healthcare Global Symposium, University of Pennsylvania School of Medicine, Philadelphia, March 3-5, 2023

Bjoern Hartmann

Interactive Flexible Style Transfer for Vector Graphics
Jeremy Warner, Kyu Won Kim, Björn Hartmann
UIST 2023
We present VST, Vector Style Transfer, a novel design tool for flexibly transferring visual styles between vector graphics.
full paper

Dual Body Bimanual Coordination in Immersive Environments
James Smith, Xinyun Cao, Adolfo G. Ramirez-Aristizabal, Björn Hartmann
DIS 2023
We investigate people’s abilities to perform coordinated bimanual selection and handoff tasks between a first-person and third-person body in VR through a user study with 19 participants.
full paper

NFT Art World: The Influence of Decentralized Systems on the Development of Novel Online Creative Communities and Cooperative Practices
Shm Garanganao Almeda, Björn Hartmann
DIS 2023
Interviews with 16 creatives utilizing NFTs reveal unique artistic subcultures, philosophies, and interactions. We observe unique qualities of decentralized distribution platforms and identify patterns of activity comparable to those of traditional art worlds. We identify how aspects of these systems might subvert, or replicate, existing systems of power, value, and access.
full paper

Herding AI Cats: Lessons from Designing a Chatbot by Prompting GPT-3
JD Zamfirescu-Pereira, Heather Wei, Amy Xiao, Kitty Gu, Grace Jung, Matthew G Lee, Bjoern Hartmann, Qian Yang
DIS 2023
This paper describes a case study of an attempt to design a robust chatbot by prompting GPT-3. It unpacks prompting’s fickleness and its impact on UX design processes, and discusses implications for LLM-based design methods and tools.
full paper

VR or Not? Investigating Interface Type and User Strategies for Interactive Design Space Exploration
Ananya Nandy, James Smith, Nicholas Jennings, Michael Kuniavsky, Björn Hartmann, and Kosa Goucher-Lambert
ICED 2023
We investigate strategies that emerge when people explore a large design space within either a non-immersive (2D) or immersive (VR) interface. Results from a 28-participant user study show that the interfaces differ in perceptions of enabling breadth or depth of exploration holistically, with preference towards 2D interfaces to compare options, and VR to understand single designs.
full paper

Why Johnny can’t prompt: how non-AI experts try (and fail) to design LLM prompts
JD Zamfirescu-Pereira, Richmond Y Wong, Björn Hartmann, and Qian Yang
CHI 2023
We explore whether non-AI-experts can successfully engage in end-user prompt engineering using a design probe – a prototype LLM-based chatbot design tool supporting development and systematic evaluation of prompting strategies. Our probe participants explored prompt designs opportunistically, not systematically, and struggled in ways echoing end-user programming systems and interactive machine learning systems.
full paper

SlideSpecs: Automatic and Interactive Presentation Feedback Collation
Jeremy Warner, Amy Pavel, Tonya Nguyen, Maneesh Agrawala, Björn Hartmann
IUI 2023
We present a tool to collate and contextualize both text and verbal feedback on slide presentations.
full paper

Ren Ng

H. K. Doyle, S. R. Herbeck, A. E. Boehm, J. E. Vanston, R. Ng, W. S. Tuten, and A. Roorda, “Boosting 2-photon vision with adaptive optics,” Journal of Vision, vol. 23, no. 12, pp. 4–4, Sep. 2023.

James O’Brien

Unique Identification of 50,000+ Virtual Reality Users from Head and Hand Motion Data
Vivek Nair, Wenbo Guo, Justus Mattern, Rui Wang, James F. O’Brien, Louis Rosenberg, Dawn Song
USENIX Security 23

Exploring the Privacy Risks of Adversarial VR Game Design
Vivek Nair, Gonzalo Munilla Garrido, Dawn Song, James F. O’Brien
PoPETS 2023

KBody: Balanced monocular whole-body estimation
Nikolaos Zioulis, James F. O’Brien
CVFAD 2023

KBody: Towards general, robust, and aligned monocular whole-body estimation
Nikolaos Zioulis, James F. O’Brien
RHOBIN 2023

Results of the 2023 Census of Beat Saber Users: Virtual Reality Gaming Population Insights and Factors Affecting Virtual Reality E-Sports Performance
Vivek Nair, Viktor Radulov, James F. O’Brien
Survey

Shankar Sastry

Xin Guo, Xinyu Li, Chinmay Maheshwari, Shankar Sastry, Manxi Wu. Markov α-Potential Games: Equilibrium Approximation and Regret Analysis. arXiv:2305.12553. May 2023. [PDF]

Michael Psenka, Druv Pai, Vishal Raman, Shankar Sastry, Yi Ma. Representation Learning via Manifold Flattening and Reconstruction. arXiv:2305.01777. May 2023. [PDF]

Chih-Yuan Chiu, Chinmay Maheshwari, Pan-Yang Su, Shankar Sastry. Arc-based Traffic Assignment: Equilibrium Characterization and Learning. arXiv:2304.04705. April 2023. [PDF]

C.-Y. Chiu, K. Kulkarni and S. Sastry. Towards Dynamic Causal Discovery with Rare Events: A Nonparametric Conditional Independence Test. 2023 62nd IEEE Conference on Decision and Control (CDC), Singapore, Singapore, 2023, pp. 7610-7616, doi: 10.1109/CDC49753.2023.10383747. [PDF]

C.-Y. Chiu, C. Maheshwari, P. -Y. Su and S. Sastry. Dynamic Tolling in Arc-based Traffic Assignment Models. 2023 59th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2023, pp. 1-8, doi: 10.1109/Allerton58177.2023.10313516. [PDF]

2022

Luisa Caldas

Zhuang X. and Caldas L., 2022, Prediction of Ventilation Performance in Urban Area with CFD Simulation and Conditional Generative Adversarial Networks. Proceedings of 5th International Conference on Building Energy and Environment, Montreal, Canada, January 2022

Bjoern Hartmann

Computational Support for Multiplicity in Hierarchical Electronics Design
Richard Lin, Rohit Ramesh, Prabal Dutta, Bjoern Hartmann, Ankur Mehta
SCF 2022
In this work, we explore two extensions of a hierarchical design model for electronics to support two types of multiplicity: scalable blocks and cross-hierarchy packing.
full paper

Concept-Annotated Examples for Library Comparison
Litao Yan, Miryung Kim, Bjoern Hartmann, Tianyi Zhang, Elena Glassman
UIST 2022
We designed a novel interactive interface, ParaLib, and used it as a technical probe to explore to what extent many side-by-side concept-annotated examples can facilitate the library comparison and selection process.
full paper

Modeling and Influencing Human Attentiveness in Autonomy-to-Human Perception Hand-offs
Yash Vardhan Pant, Balasaravanan Thoravi Kumaravel, Ameesh Shah, Erin Kraemer, Marcell Vazquez-Chanlatte, Kshitij Kulkarni, Bjoern Hartmann, Sanjit A Seshia
IEEE ITSC 2022
In this paper, we consider the perception hand-off problem, which brings the driver into the loop when the perception module of an Autonomous Vehicle (AV) is uncertain about the environment.
full paper

Predicting and Explaining Mobile UI Tappability with Vision Modeling and Saliency Analysis
Eldon Schoop, Xin Zhou, Gang Li, Zhourong Chen, Bjoern Hartmann, Yang Li
CHI 2022
We contribute a novel system that models the perceived tappability of mobile UI elements with a vision-based deep neural network and helps provide design insights with dataset-level and instance-level explanations of model predictions.
full paper

Yi Ma

On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence,
Yi Ma and Doris Tsao and Heung-Yeung Shum, arXiv:2207.04630, FITEE, 2022.

Pursuit of a Discriminative Representation for Multiple Subspaces via Sequential Games,
Druv Pai, Michael Psenka, Chih-Yuan Chiu, Manxi Wu, Edgar Dobriban, Yi Ma, arXiv:2206.09120, 2022

Incremental Learning of Structured Memory via Closed-Loop Transcription,
Shengbang Tong, Xili Dai, Ziyang Wu, Mingyang Li, Brent Yi, and Yi Ma, arXiv:2202.05411, 2022.

Predicting Out-of-Distribution Error with the Projection Norm,
Yaodong Yu, Zitong Yang, Alexander Wei, Yi Ma and Jacob Steinhardt, arXiv:2202.05834, ICML, 2022.

CTRL: Closed-Loop Transcription to an LDR via Minimaxing Rate Reduction,
Xili Dai, Shengbang Tong, Mingyang Li, Ziyang Wu, et al. and Yi Ma, first released as arXiv:2111.06636 and published in Entropy, March 2022.

ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction,
Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, and Yi Ma, arXiv:2104.10446, JMLR, 2022.

Fully Convolutional Line Parsing,
Xili Dai, Xiaojun Yuan, Haigang Gong, and Yi Ma, arXiv:2104.11207, Neurocomputing, 2022.

Computational Benefits of Intermediate Rewards for Hierarchical Planning,
Yuexiang Zhai, Christina Baek, Zhengyuan Zhou, Jiantao Jiao, and Yi Ma, arXiv:2107.03961, Journal of Artificial Intelligence Research, 2022.

In-Hand Object Rotation via Rapid Motor Adaptation,
Haozhi Qi, Ashish Kumar, Roberto Calandra, Yi Ma, and Jitendra Malik, CoRL 2022.

Revisiting Sparse Convolutional Model for Visual Recognition,
Xili Dai, Mingyang Li, Pengyuan Zhai, Shengbang Tong, Xingjian Gao, Shao-Lun Huang, Zhihui Zhu, Chong You, and Yi Ma, NeurIPS 2022.

BooNTK: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels,
Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, and Michael Jordan, NeurIPS 2022.

Robust Calibration with Multi-domain Temperature Scaling,
Yaodong Yu, Stephen Bates, Yi Ma, and Michael Jordan, NeurIPS 2022.

Efficient Maximal Coding Rate Reduction by Variational Form,
Christina Baek, Ziyang Wu, Ryan Chan, Tianjiao Ding, Yi Ma, Benjamin Haeffele, CVPR, 2022.

On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging,
Chris Junchi Li, Yaodong Yu, Nicolas Loizou, Gauthier Gidel, Yi Ma, Nicolas Le Roux, Michael I. Jordan, arXiv:2107.00464, AISTATS 2022.

Shankar Sastry

Amay Saxena, Chih-Yuan Chiu, Joseph Menke, Ritika Shrivastava, Shankar Sastry. Simultaneous Localization and Mapping: Through the Lens of Nonlinear Optimization. IEEE Robotics and Automation Letters (Volume 7, Issue 3, July 2022). [PDF]

Sanjit A Seshia, Dorsa Sadigh, S Shankar Sastry. Toward verified artificial intelligence. Communications of the ACM (Volume 65, Issue 7, Pages 46-55). [PDF]

Chinmay Maheshwari, Eric Mazumdar, Shankar Sastry. Decentralized, Communication- and Coordination-free Learning in Structured Matching Markets. arXiv preprint arXiv:2206.02344. [PDF]

Chinmay Maheshwari, Manxi Wu, Druv Pai, Shankar Sastry. Independent and Decentralized Learning in Markov Potential Games. arXiv preprint arXiv:2205.14590. [PDF]

Chinmay Maheshwari, Chih-Yuan Chiu, Eric Mazumdar, Shankar Sastry, Lillian Ratliff. Zeroth-Order Methods for Convex-Concave Min-max Problems: Applications to Decision-Dependent Risk Minimization. International Conference on Artificial Intelligence and Statistics (Pages 6702-6734). [PDF]

Michael Wu, Allen Yang, S Shankar Sastry. Full Stack engineering in Robot Open Autonomous Racing. [PDF]

Chinmay Maheshwari, Kshitij Kulkarni, Manxi Wu, Shankar Sastry. Inducing Social Optimality in Games via Adaptive Incentive Design. arXiv preprint arXiv:2204.05507. [PDF]

Tyler Westenbroek, Anand Siththaranjan, Mohsin Sarwari, Claire J Tomlin, Shankar S Sastry. On the Computational Consequences of Cost Function Design in Nonlinear Optimal Control. arXiv preprint arXiv:2204.01986. [PDF]

2021

Luisa Caldas

Keshavarzi, M., Afolabi, O., Caldas, L., Yang, A.Y., Zakhor, A., 2021, “GenScan: A Generative Method for Populating Parametric 3D Scan Datasets”, Proceedings of CAADRIA ’21, March 2021, Hong Kong

Keshavarzi, M., Caldas, L., Santos, L., 2021, “RadVR: A 6DOF Virtual Reality Daylighting Analysis Tool.” Automation in Construction 125(4) DOI:10.1016/j.autcon.2021.103623

Bjoern Hartmann

Interactive Mixed-Dimensional Media for Cross-Dimensional Collaboration in Mixed Reality Environments
Balasaravanan Thoravi Kumaravel and Björn Hartmann
Frontiers in Virtual Reality 2021
To address asymmetries in Mixed Reality environments, we introduce Interactive Mixed-Dimensional Media. In these media, the visual representation of information streams can be changed between 2D and 3D. Different representations can be chosen automatically, based on context, or through associated interaction techniques that give users control over exploring spatial, temporal, and dimensional levels of detail.
journal paper

Weaving Schematics and Code: Interactive Visual Editing for Hardware Description Languages
Richard Lin, Rohit Ramesh, Nikhil Jain, Josephine Koe, Ryan Nuqui, Prabal Dutta, Bjoern Hartmann
UIST 2021
In many engineering disciplines such as circuit board, chip, and mechanical design, a hardware description language (HDL) approach provides important benefits over direct manipulation interfaces by supporting concepts like abstraction and generator meta-programming. In this work, we investigate an IDE approach to provide a graphical editor for a board-level circuit design HDL.
full paper

UMLAUT: Debugging Deep Learning Programs using Program Structure and Model Behavior
Eldon Schoop, Forrest Huang, Bjoern Hartmann
CHI 2021
In this work, we identify Deep Learning debugging heuristics and strategies used by experts, and we categorize the types of errors novices run into when writing ML code. We then describe opportunities where tools could help novices. Umlaut checks DL program structure and model behavior against these heuristics; provides human-readable error messages to users; and annotates erroneous model output to facilitate error correction.
full paper

Multi-level Correspondence via Graph Kernels for Editing Vector Graphics Designs
Hijung V. Shin, Jeremy Warner, Bjoern Hartmann, Celso Gomes, Holger Winnemoeller, and Wilmot Li
GI 2021
Graphic designs often contain repeating sets of elements with a similar structure. We introduce an algorithm that automatically computes this shared structure which enables graphical edits to be transferred from a set of source elements to multiple targets. For example, designers may want to propagate isolated edits to element attributes, apply nested layout adjustments, or transfer edits across different designs.
full paper

Yi Ma

Learning and Meshing from Deep Implicit Surface Networks Using an Efficient Implementation of Analytic Marching,
Jiabao Lei, Kui Jia, and Yi Ma, arXiv:2106.10031, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

Towards Unified Acceleration of High-Order Algorithms under Holder Continuity and Uniform Convexity,
Chaobing Song and Yi Ma, arXiv:1906.00582 (posted June 2019), SIAM Journal on Optimization (SIOPT), 2021.

Disentangled Representation Learning for Controllable Image Synthesis: An Information-Theoretic Perspective,
Shichang Tang, Xu Zhou, Xuming He, and Yi Ma, International Conference on Pattern Recognition (ICPR), 2021.

Adversarial Robustness of Stabilized Neural ODEs Might be from Obfuscated Gradients,
Yifei Huang, Yaodong Yu, Hongyang Zhang, Yi Ma, and Yuan Yao, arXiv:2009.13145, final version to appear in Mathematical and Scientific Machine Learning (MSML), Switzerland, 2021.

Learning Long-term Visual Dynamics with Region Proposal Interaction Networks,
Haozhi Qi, Xiaolong Wang, Deepak Pathak, Yi Ma, and Jitendra Malik, arXiv:2008.02265, ICLR 2021.

Incremental Learning via Rate Reduction,
Ziyang Wu, Christina Baek, Chong You, and Yi Ma, arXiv version posted at arXiv:2011.14593 in November 2020, CVPR 2021.

NeRD: Neural 3D Reflection Symmetry Detector,
Yichao Zhou, Sichen Liu, and Yi Ma, an older arXiv version at: arXiv:2006.10042, CVPR 2021.

Ren Ng

M. Tancik, B. Mildenhall, T. Wang, D. Schmidt, P. P. Srinivasan, J. T. Barron, and R. Ng, “Learned Initializations for Optimizing Coordinate-Based Neural Representations,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

Shankar Sastry

V. Tuck, Y. V. Pant, S. A. Seshia and S. S. Sastry, “DEC-LOS-RRT: Decentralized Path Planning for Multi-robot Systems with Line-of-sight Constrained Communication,” 2021 IEEE Conference on Control Technology and Applications (CCTA), 2021, pp. 103-110, doi: 10.1109/CCTA48906.2021.9659247. [PDF]

C. Maheshwari, K. Kulkarni, M. Wu and S.S. Sastry, 2021. Dynamic Tolling for Inducing Socially Optimal Traffic Loads. arXiv preprint arXiv:2110.08879. [PDF]

C. Maheshwari, C. Y. Chiu, E. Mazumdar, S.S. Sastry, L.J. Ratliff. Zeroth-Order Methods for Convex-Concave Minmax Problems: Applications to Decision-Dependent Risk Minimization. arXiv preprint arXiv:2106.09082. 2021 Jun 16. [PDF]

P Dayani, N Orr, V Saran, N Hu, S Krishnaswamy, A Thomopoulos, E Wang, J Bae, E Zhang, D McPherson, J Menke, A Moran, B Quiter, A Yang, K Vetter (2021). Immersive Operation of a Semi-Autonomous Aerial Platform for Detecting and Mapping Radiation. IEEE Transactions on Nuclear Science, 68(12), 2702-2710. [PDF]

McPherson, D. L., Stocking, K. C., & Sastry, S. S. (2021, August). Maximum Likelihood Constraint Inference from Stochastic Demonstrations. In 2021 IEEE Conference on Control Technology and Applications (CCTA) (pp. 1208-1213). IEEE. [PDF]

McPherson, D. L., & Sastry, S. S. An Efficient Understandability Objective for Dynamic Optimal Control. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 986-992). IEEE. [PDF]

Tyler Westenbroek, Max Simchowitz, Michael I Jordan, S Shankar Sastry. On the stability of nonlinear receding horizon control: a geometric perspective. 2021 60th IEEE Conference on Decision and Control (CDC) (Pages 742-749). [PDF]

Tijana Zrnic, Eric Mazumdar, Shankar Sastry, Michael Jordan. Who Leads and Who Follows in Strategic Classification? Advances in Neural Information Processing Systems (Volume 34, Pages 15257-15269). [PDF]

Mohammad Ranjbar, HL Nguyen, Nghi H Tran, Tutku Karacolak, Shivakumar Sastry, LD Nguyen. Energy efficiency of full-duplex cognitive radio in low-power regimes under imperfect spectrum sensing. Mobile Networks and Applications (Volume 25, Issue 4, Pages 1750-1764). [PDF]

Shankar Sastry, Forrest Laine, Claire Tomlin. Optimal control and the linear quadratic regulator. [PDF]

Tyler Westenbroek, Xiaobin Xiong, S Shankar Sastry, Aaron D Ames. Smooth approximations for hybrid optimal control problems with application to robotic walking. IFAC-PapersOnLine (Volume 55, Issue 5, Pages 181-186). [PDF]

2020

Feature Expansive Reward Learning: Rethinking Human Input
When a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input, but they rely on handcrafted features. When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input in which the person guides the robot from states…

Author(s): Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan.
Journal/Conference:
arxiv.org/abs/2006.13208

Learned Initializations for Optimizing Coordinate-Based Neural Representations
We propose applying standard meta-learning algorithms to learn the initial weight parameters for coordinate based neural representations based on the underlying class of signals being represented (e.g., images of faces or 3D models of chairs). Despite requiring only a minor change in implementation, using these learned initial weights enables faster convergence during optimization and can serve as a strong prior over the signal class being modeled, resulting in better generalization when only partial observations of a given signal are available…

Author(s): Matthew Tancik*, Ben Mildenhall*, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng.
Journal/Conference:
arxiv.org/abs/2012.02189

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location (x, y, z) and viewing direction (θ, φ)) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays…

Author(s): Ben Mildenhall*, Pratul P. Srinivasan*, Matthew Tancik*, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng.
Journal/Conference: ECCV (2020) Oral – Best Paper Honorable Mention
arxiv.org/abs/2003.08934

Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron (MLP) to learn high-frequency functions in low-dimensional problem domains. These results shed light on recent advances in computer vision and graphics that achieve state-of-the-art results by using MLPs to represent complex 3D objects and scenes. Using tools from the neural tangent kernel (NTK) literature, we show that a standard MLP fails to learn high frequencies both in theory and in practice. To overcome this spectral bias, we use a Fourier feature mapping to transform the effective NTK into a stationary kernel with a tunable bandwidth. We suggest an approach for selecting problem-specific Fourier features that greatly improves the performance of MLPs for low-dimensional regression tasks relevant to the computer vision and graphics communities.

Author(s): Matthew Tancik*, Ben Mildenhall*, Pratul P. Srinivasan*, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, Ren Ng.
Journal/Conference: NeurIPS (2020) Spotlight
arxiv.org/abs/2006.10739
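
As a rough illustration of the Fourier feature mapping described in the abstract above, the sketch below maps a low-dimensional input v to [cos(2πBv), sin(2πBv)], with the rows of B drawn from a zero-mean Gaussian. The function names, the Gaussian sampling, and the `scale` parameter (standing in for the tunable bandwidth) are assumptions made for this example and are not taken from the listing.

```python
import math
import random


def make_frequencies(num_features, input_dim, scale, seed=0):
    """Sample a frequency matrix B (num_features x input_dim) from N(0, scale^2).

    `scale` is an assumed knob: larger values favor higher frequencies.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(input_dim)]
            for _ in range(num_features)]


def fourier_features(v, B):
    """Map an input point v to [cos(2*pi*B v), sin(2*pi*B v)]."""
    projections = [
        2.0 * math.pi * sum(b_i * v_i for b_i, v_i in zip(row, v))
        for row in B
    ]
    return [math.cos(p) for p in projections] + [math.sin(p) for p in projections]


# Hypothetical usage: embed a 2D coordinate before feeding it to an MLP.
B = make_frequencies(num_features=4, input_dim=2, scale=10.0)
features = fourier_features([0.25, 0.5], B)
```

The resulting feature vector has twice as many entries as B has rows, and it would replace the raw coordinate as the network input.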

GenScan: A Generative Method for Populating Parametric 3D Scan Datasets
The availability of rich 3D datasets corresponding to the geometrical complexity of the built environments is considered an ongoing challenge for 3D deep learning methodologies. To address this challenge, we introduce GenScan, a generative system that populates synthetic 3D scan datasets in a parametric fashion. The system takes an existing captured 3D scan as an input and outputs alternative variations of the building layout including walls, doors, and furniture with corresponding textures. GenScan is a fully automated system that can also be manually controlled by a user through an assigned user interface. Our proposed system utilizes a combination of a hybrid deep neural network and a parametrizer module to extract and transform elements of a given 3D scan….

Author(s): Mohammad Keshavarzi, Oladapo Afolabi, Luisa Caldas, Allen Y. Yang, Avideh Zakhor.
Journal/Conference: December 2020
arxiv.org/abs/2012.03998

Optimistic Dual Extrapolation for Non-monotone Variational Inequality
Author(s): Chaobing Song, Yichao Zhou, Zhengyuan Zhou, Yong Jiang, and Yi Ma.
Journal/Conference: NeurIPS, December 2020.

Voronoi Progressive Widening: Efficient Online Solvers for Continuous Space MDPs and POMDPs with Provably Optimal Components
Markov decision processes (MDPs) and partially observable MDPs (POMDPs) can effectively represent complex real-world decision and control problems. However, continuous space MDPs and POMDPs, i.e. those having continuous state, action and observation spaces, are extremely difficult to solve, and there are few online algorithms with convergence guarantees. This paper introduces Voronoi Progressive Widening (VPW), a general technique to modify tree search algorithms to effectively handle continuous or hybrid action spaces, and proposes…

Author(s): Michael H. Lim, Claire J. Tomlin, Zachary N. Sunberg.
Journal/Conference: December 2020.
arxiv.org/abs/2012.10140

Stochastic Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization
In this paper, we introduce a simplified and unified method for finite-sum convex optimization, named Stochastic Variance Reduction via Accelerated Dual Averaging (SVR-ADA). In the nonstrongly convex and smooth setting, SVR-ADA can attain an O(1/n)-accurate solution in O(n log log n) stochastic gradient evaluations, where n is the number of samples; meanwhile, SVR-ADA matches the lower bound…

Author(s): Chaobing Song, Yong Jiang, and Yi Ma.
Journal/Conference: NeurIPS, December 2020.
arXiv:2006.10281

Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization
Recent advances have shown that implicit bias of gradient descent on over-parameterized models enables the recovery of low-rank matrices from linear measurements, even with no prior knowledge on the intrinsic rank. In contrast, for robust low-rank matrix recovery from grossly corrupted measurements, over-parameterization leads to overfitting without prior knowledge on both the intrinsic rank and sparsity of corruption. This paper shows that with a double over-parameterization for both…

Author(s): Chong You, Zhihui Zhu, Qing Qu, and Yi Ma.
Journal/Conference: NeurIPS (spotlight), December 2020.
arXiv:2006.08857

Dynamic Legged Manipulation of a Ball Through Multi-Contact Optimization
The feet of robots are typically used to design locomotion strategies, such as balancing, walking, and running. However, they also have great potential to perform manipulation tasks. In this paper, we propose a model predictive control (MPC) framework for a quadrupedal robot to dynamically balance on a ball and simultaneously manipulate it to follow various trajectories such as straight lines, sinusoids, circles and in-place turning…

Author(s): Chenyu Yang, Bike Zhang, Jun Zeng, Ayush Agrawal, Koushil Sreenath.
Journal/Conference:
arxiv.org/abs/2008.00191

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction
To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate between classes, we propose the principle of Maximal Coding Rate Reduction (MCR²), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class. We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive…

Author(s): Yaodong Yu, Kwan Ho Ryan Chan, Chong You, Chaobing Song, and Yi Ma.
Journal/Conference: NeurIPS, December 2020.
arXiv:2006.08558

Trajectory Optimization for Nonlinear Multi-Agent Systems using Decentralized Learning Model Predictive Control
We present a decentralized minimum-time trajectory optimization scheme based on learning model predictive control for multi-agent systems with nonlinear decoupled dynamics and coupled state constraints. By performing the same task iteratively, data from previous task executions is used to construct and improve local time-varying safe sets and an approximate value function…

Author(s): Edward L. Zhu, Yvonne R. Stürz, Ugo Rosolia, Francesco Borrelli.
Journal/Conference: Conference on Decision and Control 2020
arxiv.org/abs/2004.01298

Formation and Reconfiguration of Tight Multi-Lane Platoons
Advances in vehicular communication technologies are expected to facilitate cooperative driving. Connected and Automated Vehicles (CAVs) are able to collaboratively plan and execute driving maneuvers by sharing their perceptual knowledge and future plans. In this paper, an architecture for autonomous navigation of tight multi-lane platoons travelling on public roads is presented. Using the proposed approach, CAVs are able to form single or multi-lane platoons of various geometrical configurations. They are able to reshape and adjust their configurations according to changes in the environment…

Author(s): Roya Firoozi, Xiaojing Zhang, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2003.08595

TransceiVR: Bridging Asymmetrical Communication Between VR Users and External Collaborators
Virtual Reality (VR) users often need to work with other users, who observe them outside of VR using an external display. Communication between them is difficult; the VR user cannot see the external user’s gestures, and the external user cannot see VR scene elements outside of the VR user’s view. We carried out formative interviews with experts to understand these asymmetrical interactions and identify their goals and challenges. From this, we identify high-level system design goals to facilitate asymmetrical interactions and a corresponding space of implementation approaches based on the level of programmatic access to a VR application. We present TransceiVR, a system that utilizes VR platform APIs to enable asymmetric communication interfaces for third-party applications without requiring source code access…

Author(s): Balasaravanan Thoravi Kumaravel, Cuong Nguyen, Stephen DiVerdi, Bjoern Hartmann.
Journal/Conference: UIST 2020

Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Group
This paper considers the fundamental problem of learning a complete (orthogonal) dictionary from samples of sparsely generated signals. Most existing methods solve the dictionary (and sparse representations) based on heuristic algorithms, usually without theoretical guarantees for either optimality or complexity. The recent ℓ1-minimization based methods do provide such guarantees but the associated algorithms recover the dictionary one column at a time. In this work, we propose a new formulation that maximizes the ℓ4-norm over the orthogonal group, to learn the entire dictionary….

Author(s): Yuexiang Zhai, Zitong Yang, Zhenyu Liao, John Wright, and Yi Ma.
Journal/Conference: Journal of Machine Learning Research (JMLR), 2020.
arXiv:1906.02435

Understanding L4-based Dictionary Learning: Interpretation, Stability, and Robustness

Author(s): Yuexiang Zhai, Hermish Mehta, Zhengyuan Zhou, and Yi Ma.
Journal/Conference: International Conference on Learning Representations (ICLR), 2020.

Staging energy sources to extend flight time of a multirotor UAV
Energy sources such as batteries do not decrease in mass after consumption, unlike combustion-based fuels. We present the concept of staging energy sources, i.e. consuming energy in stages and ejecting used stages, to progressively reduce the mass of aerial vehicles in-flight which reduces power consumption, and consequently increases flight time. A flight time vs. energy storage mass analysis is presented to show the endurance benefit of staging to multirotors. We consider two specific problems in discrete staging — optimal order of staging given a certain number of energy sources, and optimal partitioning of a given energy storage mass budget into a given number of stages….

Author(s): Karan P. Jain, Jerry Tang, Koushil Sreenath, Mark W. Mueller.
Journal/Conference: IROS 2020
arxiv.org/abs/2003.04290

Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects
This paper presents a method to design a min-norm Control Lyapunov Function (CLF)-based stabilizing controller for a control-affine system with uncertain dynamics using Gaussian Process (GP) regression. We propose a novel compound kernel that captures the control-affine nature of the problem, which permits the estimation of both state and input-dependent model uncertainty in a single GP regression problem. Furthermore, we provide probabilistic guarantees of convergence by the use of GP Upper Confidence Bound analysis and the formulation of a CLF-based stability chance constraint which can be incorporated in a min-norm optimization problem…

Author(s): Fernando Castañeda, Jason J. Choi, Bike Zhang, Claire J. Tomlin, Koushil Sreenath.
Journal/Conference:
arxiv.org/abs/2011.07183

Collision Avoidance in Tightly-Constrained Environments without Coordination: a Hierarchical Control Approach
We present a hierarchical control approach for maneuvering an autonomous vehicle (AV) in a tightly-constrained environment where other moving AVs and/or human driven vehicles are present. A two-level hierarchy is proposed: a high-level data-driven strategy predictor and a lower-level model-based feedback controller. The strategy predictor maps a high-dimensional environment encoding into a set of high-level strategies…

Author(s): Xu Shen, Edward L. Zhu, Yvonne R. Stürz, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2011.00413

DeepReach: A Deep Learning Approach to High-Dimensional Reachability
Hamilton-Jacobi (HJ) reachability analysis is an important formal verification method for guaranteeing performance and safety properties of dynamical control systems. Its advantages include compatibility with general nonlinear system dynamics, formal treatment of bounded disturbances, and the ability to deal with state and input constraints. However, it involves solving a PDE, whose computational and memory complexity scales exponentially…

Author(s): Somil Bansal, Claire Tomlin.
Journal/Conference: November, 2020.
arxiv.org/abs/2011.02082

Multi-Hypothesis Interactions in Game-Theoretic Motion Planning
We present a novel method for handling uncertainty about the intentions of non-ego players in dynamic games, with application to motion planning for autonomous vehicles. Equilibria in these games explicitly account for interaction among other agents in the environment, such as drivers and pedestrians. Our method models the uncertainty about the intention of other agents by constructing multiple hypotheses about the objectives and constraints of other agents in the scene…

Author(s): Forrest Laine, David Fridovich-Keil, Chih-Yuan Chiu, Claire Tomlin.
Journal/Conference: November, 2020.
arxiv.org/abs/2011.06047

Testing for Typicality with Respect to an Ensemble of Learned Distributions
Methods of performing anomaly detection on high-dimensional data sets are needed, since algorithms which are trained on data are only expected to perform well on data that is similar to the training data. There are theoretical results on the ability to detect if a population of data is likely to come from a known base distribution, which is known as the goodness-of-fit problem. One-sample approaches to this problem offer significant computational advantages for online testing, but require knowing a model of the base distribution. The ability to correctly reject anomalous data in this setting hinges on the accuracy of the model of the base distribution…

Author(s): Forrest Laine, Claire Tomlin.
Journal/Conference: November, 2020.
arxiv.org/abs/2011.06041

Encoding Defensive Driving as a Dynamic Nash Game
Robots deployed in real-world environments should operate safely in a robust manner. In scenarios where an “ego” agent navigates in an environment with multiple other “non-ego” agents, two modes of safety are commonly proposed: adversarial robustness and probabilistic constraint satisfaction. However, while the former is generally computationally-intractable and leads to overconservative solutions, the latter typically relies on strong distributional assumptions and ignores strategic coupling between agents. To avoid these drawbacks, we present a novel formulation of robustness within the framework of general sum dynamic game theory, modeled on defensive driving…

Author(s): Chih-Yuan Chiu, David Fridovich-Keil, Claire J. Tomlin.
Journal/Conference: November, 2020.
arxiv.org/abs/2011.04815

Approximate Solutions to a Class of Reachability Games
In this paper, we present a method for finding approximate Nash equilibria in a broad class of reachability games. These games are often used to formulate both collision avoidance and goal satisfaction. Our method is computationally efficient, running in real-time for scenarios involving multiple players and more than ten state dimensions. The proposed approach forms a family of increasingly exact approximations to the original game. Our results characterize the quality of these approximations and show operation in a receding horizon, minimally-invasive control context. Additionally, as a special case, our method reduces to local optimization in the single-player (optimal control) setting, for which a wide variety of efficient algorithms exist.

Author(s): David Fridovich-Keil, Claire J. Tomlin.
Journal/Conference: November, 2020.
arxiv.org/abs/2011.00601

Incremental Learning via Rate Reduction
Current deep learning architectures suffer from catastrophic forgetting, a failure to retain knowledge of previously learned classes when incrementally trained on new classes. The fundamental roadblock faced by deep learning methods is that deep learning models are optimized as “black boxes,” making it difficult to properly adjust the model parameters to preserve knowledge about previously seen data. To overcome the problem of catastrophic forgetting…

Author(s): Ziyang Wu, Christina Baek, Chong You, and Yi Ma.
Journal/Conference: November, 2020.
arXiv:2011.14593

Comments on Efficient Singular Value Thresholding Computation
We discuss how to evaluate the proximal operator of a convex and increasing function of a nuclear norm, which forms the key computational step in several first-order optimization algorithms such as (accelerated) proximal gradient descent and ADMM. Various special cases of the problem arise in low-rank matrix completion, dropout training in deep learning and high-order low-rank tensor recovery, although they have all been solved on a case-by-case basis. We provide a unified and efficiently computable procedure for solving this problem…

Author(s): Zhengyuan Zhou and Yi Ma.
Journal/Conference: November, 2020.
arXiv:2011.06710

Safety-Critical Model Predictive Control with Discrete-Time Control Barrier Function
The optimal performance of robotic systems is usually achieved near the limit of state and input bounds. Model predictive control (MPC) is a prevalent strategy to handle these operational constraints; however, safety still remains an open challenge for MPC as it needs to guarantee that the system stays within an invariant set. In order to obtain safe optimal performance in the context of set invariance, we present a safety-critical model predictive control strategy utilizing discrete-time control barrier functions…

Author(s): Jun Zeng, Bike Zhang, Koushil Sreenath.
Journal/Conference:
arxiv.org/abs/2007.11718

Expert Selection in High-Dimensional Markov Decision Processes
In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings. Our method takes a set of candidate expert policies and switches between them to rapidly identify the best performing expert using a variant of the classical upper confidence bound algorithm, thus ensuring low regret in the overall performance of the system. This is useful in applications where several expert policies may be available, and one needs to be selected at run-time for the underlying environment.

Author(s): Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry.
Journal/Conference: In proceedings of the 59th IEEE Conference on Decision and Control 2020
arxiv.org/abs/2010.15599
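
To make the bandit view of expert selection in the abstract above more concrete, here is a minimal sketch of the classical UCB1 rule, not the variant used in the paper. Each "expert" is stubbed as a hypothetical Bernoulli reward probability rather than a policy running in an MDP; all names and parameters here are assumptions for illustration.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Pick an expert index using the classical UCB1 score."""
    # Try every expert once before trusting the confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_expert_selection(expert_reward_probs, rounds, seed=0):
    """Simulate online selection among experts with Bernoulli rewards (assumed)."""
    rng = random.Random(seed)
    k = len(expert_reward_probs)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, rounds + 1):
        i = ucb1_select(counts, means, t)
        reward = 1.0 if rng.random() < expert_reward_probs[i] else 0.0
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]  # running average
    return counts, means


# Hypothetical usage: three experts with different success rates.
counts, means = run_expert_selection([0.2, 0.5, 0.9], rounds=2000)
```

In this toy setting the selection counts concentrate on the expert with the highest success rate, which is the low-regret behavior the abstract refers to.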

Deep Networks from the Principle of Rate Reduction
This work attempts to interpret modern deep (convolutional) networks from the principles of rate reduction and (shift) invariant classification. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction of learned features naturally leads to a multi-layer deep network, one iteration per layer. The layered architectures, linear…

Author(s): Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, and Yi Ma.
Journal/Conference: October, 2020.
arXiv:2010.14765

Control of Unknown Nonlinear Systems with Linear Time-Varying MPC
We present a Model Predictive Control (MPC) strategy for unknown input-affine nonlinear dynamical systems. A non-parametric method is used to estimate the nonlinear dynamics from observed data. The estimated nonlinear dynamics are then linearized over time varying regions of the state space to construct an Affine Time Varying (ATV) model…

Author(s): Dimitris Papadimitriou, Ugo Rosolia, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2004.03041

Animated Cassie: A Dynamic Relatable Robotic Character
Creating robots with emotional personalities will transform the usability of robots in the real world. As previous emotive social robots are mostly based on statically stable robots whose mobility is limited, this paper develops an animation to real world pipeline that enables dynamic bipedal robots that can twist, wiggle, and walk to behave with emotions. First, an animation method is introduced to design emotive motions for the virtual robot character. Second, a dynamics optimizer is used to convert the animated motion to dynamically feasible motion…

Author(s): Zhongyu Li, Christine Cummings, Koushil Sreenath.
Journal/Conference: IROS 2020
arxiv.org/abs/2009.02846

Adversarial Robustness of Stabilized Neural ODEs Might be from Obfuscated Gradients
In this paper we introduce a provably stable architecture for Neural Ordinary Differential Equations (ODEs) which achieves non-trivial adversarial robustness under white-box adversarial attacks even when the network is trained naturally. Most existing defense methods that withstand strong white-box attacks need to train the network adversarially to improve its robustness, and hence have to strike a trade-off between natural accuracy and adversarial robustness. Inspired by…

Author(s): Yifei Huang, Yaodong Yu, Hongyang Zhang, Yi Ma, and Yuan Yao.
Journal/Conference: September, 2020.
arXiv:2009.13145

Robust MPC for Linear Systems with Parametric and Additive Uncertainty: A Novel Constraint Tightening Approach
We propose a novel approach to design a robust Model Predictive Controller (MPC) for constrained uncertain linear systems. The system dynamics matrices are not known exactly, leading to parametric model mismatch. We also consider the presence of an additive disturbance. Set based bounds for each component of the model uncertainty are assumed to be known…

Author(s): Monimoy Bujarbaruah, Ugo Rosolia, Yvonne R Stürz, Xiaojing Zhang, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2007.00930

Learning to Satisfy Unknown Constraints in Iterative MPC
We propose a control design method for linear time-invariant systems that iteratively learns to satisfy unknown polyhedral state constraints. At each iteration of a repetitive task, the method constructs an estimate of the unknown environment constraints using collected closed-loop trajectory data…

Author(s): Monimoy Bujarbaruah, Charlott Vallon, Francesco Borrelli.
Journal/Conference: IEEE-CDC 2020
arxiv.org/abs/2006.05054

SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors
Spatial computing experiences are constrained by the real-world surroundings of the user. In such experiences, augmenting virtual objects to existing scenes requires a contextual approach, where geometrical conflicts are avoided, and functional and plausible relationships to other objects are maintained in the target environment. Yet, due to the complexity and diversity of user environments, automatically calculating ideal positions of virtual content that is adaptive to the context of the scene is considered a challenging task. Motivated by this problem, in this paper we introduce SceneGen, a generative contextual augmentation framework that predicts virtual object positions and orientations within existing scenes…

Author(s): Mohammad Keshavarzi, Aakash Parikh, Xiyu Zhai, Melody Mao, Luisa Caldas, Allen Y. Yang.
Journal/Conference: September 2020
arxiv.org/abs/2009.12395

HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures
We present HoliCity, a city-scale 3D dataset with rich structural information. Currently, this dataset has 6,300 real-world panoramas of resolution 13312×6656 that are accurately aligned with the CAD model of downtown London with an area of more than 20 km², in which the median reprojection error of the alignment of an average image is less than half a degree. This dataset aims…

Author(s): Yichao Zhou, Jingwei Huang, Xili Dai, Linjie Luo, Zhili Chen, and Yi Ma.
Journal/Conference: August, 2020.
arXiv:2008.03286

Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
Learning long-term dynamics models is the key to understanding physical common sense. Most existing approaches on learning dynamics from visual input sidestep long-term predictions by resorting to rapid re-planning with short-term models. This not only requires such models to be super accurate but also limits them only to tasks where an agent can continuously obtain feedback and take action at each step until completion. In this paper, we aim to…

Author(s): Haozhi Qi, Xiaolong Wang, Deepak Pathak, Yi Ma, and Jitendra Malik.
Journal/Conference: August, 2020.
arXiv:2008.02265

Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions
In this paper, the issue of model uncertainty in safety-critical control is addressed with a data-driven approach. For this purpose, we utilize the structure of an input-output linearization controller based on a nominal model along with a Control Barrier Function and Control Lyapunov Function based Quadratic Program (CBF-CLF-QP). Specifically, we propose a novel reinforcement learning framework which learns the model uncertainty present in the CBF and CLF constraints…

Author(s): Jason Choi, Fernando Castañeda, Claire J. Tomlin, Koushil Sreenath.
Journal/Conference:
arxiv.org/abs/2004.07584

Rethinking Bias-Variance Trade-off for Generalization of Neural Networks
The classical bias-variance trade-off predicts that bias decreases and variance increases with model complexity, leading to a U-shaped risk curve. Recent work calls this into question for neural networks and other over-parameterized models, for which it is often observed that larger models generalize better. We provide a simple explanation for this by measuring the bias and variance of neural networks: while the bias is monotonically decreasing as in the classical theory, the variance is unimodal or bell-shaped: it increases then decreases with the width of the network…

Author(s): Zitong Yang, Yaodong Yu, Chong You, Jacob Steinhardt, and Yi Ma.
Journal/Conference: International Conference on Machine Learning (ICML), June 2020.
arXiv:2002.11328

Distributed Learning Model Predictive Control for Linear Systems
This paper presents a distributed learning model predictive control (DLMPC) scheme for distributed linear time invariant systems with coupled dynamics and state constraints. The proposed solution method is based on an online distributed optimization scheme with nearest-neighbor communication. If the control task is iterative and data from previous feasible iterations are available, local data are exploited by the subsystems in order to construct the local terminal set and terminal cost, which guarantee recursive feasibility and asymptotic stability, as well as performance improvement over iterations…

Author(s): Yvonne R. Stürz, Edward L. Zhu, Ugo Rosolia, Karl H. Johansson, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2006.13406

A Distributed Multi-Robot Coordination Algorithm for Navigation in Tight Environments
This work presents a distributed method for multi-robot coordination based on nonlinear model predictive control (NMPC) and dual decomposition. Our approach allows the robots to coordinate in tight spaces (e.g., highway lanes, parking lots, warehouses, canals, etc.) by using a polytopic description of each robot’s shape and formulating the collision avoidance as a dual optimization problem. Our method accommodates heterogeneous teams of robots (i.e., robots with different polytopic shapes and dynamic models can be part of the same team) and can be used to avoid collisions in…

Author(s): Roya Firoozi, Laura Ferranti, Xiaojing Zhang, Sebastian Nejadnik, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2006.11492

Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction
3D reconstruction from a single RGB image is a challenging problem in computer vision. Previous methods are usually solely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability. In this work, we focus on object-level 3D reconstruction and present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondence of the symmetry…

Author(s): Yichao Zhou, Sichen Liu, and Yi Ma.
Journal/Conference: June 2020.
arXiv:2006.10042

Deep Isometric Learning for Visual Recognition
Initialization, normalization, and skip connections are believed to be three indispensable techniques for training very deep convolutional neural networks and obtaining state-of-the-art performance. This paper shows that deep vanilla ConvNets without normalization or skip connections can also be trained to achieve surprisingly good performance on standard image recognition benchmarks. This is achieved by enforcing the convolution kernels to be near isometric during initialization and training, as well as by using a variant of ReLU that is shifted towards being isometric…

Author(s): Haozhi Qi, Chong You, Xiaolong Wang, Yi Ma, and Jitendra Malik.
Journal/Conference: International Conference on Machine Learning (ICML), June 2020.
arXiv:2006.16992

Data-Driven Hierarchical Predictive Learning in Unknown Environments
We propose a hierarchical learning architecture for predictive control in unknown environments. We consider a constrained nonlinear dynamical system and assume the availability of state-input trajectories solving control tasks in different environments. A parameterized environment model generates state constraints specific to each task, which are satisfied by the stored trajectories. Our goal is to find a feasible trajectory for a new task in an unknown environment…

Author(s): Charlott Vallon, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2005.05948

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning
While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent’s policy or demonstrated behavior on a control task, it is often the case that such behavior is more succinctly represented by a simple reward combined with a set of hard constraints. In this setting, the agent is attempting to maximize cumulative rewards subject to these given constraints on their behavior. We reformulate the problem of IRL on Markov Decision Processes (MDPs) such that…

Author(s): Dexter R.R. Scobee and S. Shankar Sastry.
Journal/Conference: International Conference on Learning Representations (ICLR), 2020

Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and not being able to account for input constraints. Model uncertainty is common in almost every robotic application and input saturation is present in every real world system. In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques…

Author(s): Fernando Castañeda, Mathias Wulfman, Ayush Agrawal, Tyler Westenbroek, Claire J. Tomlin, S. Shankar Sastry, Koushil Sreenath.
Journal/Conference: Learning for Dynamics and Control (L4DC) 2020 Conference
arxiv.org/abs/2004.07276

Optimal Robust Safety-Critical Control for Dynamic Robotics
We present a novel method of optimal robust control through quadratic programs that offers tracking stability while subject to input and state-based constraints as well as safety-critical constraints for nonlinear dynamical robotic systems in the presence of model uncertainty…

Author(s): Quan Nguyen, Koushil Sreenath.
Journal/Conference:
arxiv.org/abs/2005.07284

Task Decomposition for MPC: A Computationally Efficient Approach for Linear Time-Varying Systems
A Task Decomposition method for iterative learning Model Predictive Control (TDMPC) for linear time-varying systems is presented. We consider the availability of state-input trajectories which solve an original task T1, and design a feasible MPC policy for a new task, T2, using stored data from T1…

Author(s): Charlott Vallon, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2005.01673

Eyes-Closed Safety Kernels: Safety for Autonomous Systems Under Loss of Observability
A framework is presented for handling a potential loss of observability of a dynamical system in a provably-safe way. Inspired by the fragility of data-driven perception systems used by autonomous vehicles, we formulate the problem that arises when a sensing modality fails or is found to be untrustworthy during autonomous operation. We cast this problem as a differential game played between the dynamical system being controlled and the external system factor(s) for which observations are lost…

Author(s): Forrest Laine, Chih-Yuan Chiu, Claire Tomlin.
Journal/Conference: Robotics: Science and Systems 2020
arxiv.org/abs/2005.07144

Inference-Based Strategy Alignment for General-Sum Differential Games
In many settings where multiple agents interact, the optimal choices for each agent depend heavily on the choices of the others. These coupled interactions are well-described by a general-sum differential game, in which players have differing objectives, the state evolves in continuous time, and optimal play may be characterized by one of many equilibrium concepts, e.g., a Nash equilibrium…

Author(s): Lasse Peters, David Fridovich-Keil, Claire J. Tomlin, Zachary N. Sunberg.
Journal/Conference:
arxiv.org/abs/2002.04354

Learning Min-norm Stabilizing Control Laws for Systems with Unknown Dynamics
This paper introduces a framework for learning a minimum-norm stabilizing controller for a system with unknown dynamics using model-free policy optimization methods. The approach begins by first designing a Control Lyapunov Function (CLF) for a (possibly inaccurate) dynamics model for the system, along with a function which specifies a minimum acceptable rate of energy dissipation for the CLF at different points in the state-space. Treating the energy dissipation condition as a constraint on the desired closed-loop behavior of the real-world system, we use penalty methods to formulate an unconstrained optimization problem over the parameters of a learned controller, which can be solved using model-free policy optimization algorithms using data collected from the plant…

Author(s): Tyler Westenbroek, Fernando Castañeda, Ayush Agrawal, S. Shankar Sastry, Koushil Sreenath.
Journal/Conference: April 21, 2020
arxiv.org/abs/2004.10331

ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots
We investigate the problem of predicting driver behavior in parking lots, an environment which is less structured than typical road networks and features complex, interactive maneuvers in a compact space. Using the CARLA simulator, we develop a parking lot environment and collect a dataset of human parking maneuvers. We then study the impact of model complexity and feature information by comparing…

Author(s): Xu Shen, Ivo Batkovic, Vijay Govindarajan, Paolo Falcone, Trevor Darrell, Francesco Borrelli.
Journal/Conference: IEEE Intelligent Vehicles Symposium (IV) 2020
arxiv.org/abs/2004.10293

Extending DeepSDF for automatic 3D shape retrieval and similarity transform estimation
Recent advances in computer graphics and computer vision have found successful application of deep neural network models for 3D shapes based on signed distance functions (SDFs) that are useful for shape representation, retrieval, and completion. However, this approach has been limited by the need to have query shapes in the same canonical scale and pose as those observed during training, restricting its effectiveness on real world scenes. We present a formulation to overcome this issue by jointly estimating shape and similarity transform parameters. We conduct experiments to demonstrate the effectiveness of this formulation on synthetic and real datasets and report favorable comparisons to the state of the art. Finally, we also emphasize the viability of this approach as a form of 3D model compression.

Author(s): Oladapo Afolabi, Allen Yang, S. Shankar Sastry.
Journal/Conference: Apr 20, 2020
arxiv.org/abs/2004.09048

Adaptive Control for Linearizable Systems using On-Policy Reinforcement Learning
This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities…

Author(s): Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin and S. Shankar Sastry.
Journal/Conference: Technical Report, April 2020
arxiv.org/abs/2004.02766

Exponentially Stable First Order Control on Matrix Lie Groups
We present a novel first order controller for systems evolving on matrix Lie groups, a major use case of which is Cartesian velocity control on robot manipulators. This controller achieves global exponential trajectory tracking on a number of commonly used Lie groups including the Special Orthogonal Group SO(n), the Special Euclidean Group SE(n), and the General Linear Group over complex numbers GL(n, C). Additionally, this controller achieves local exponential trajectory tracking on all matrix Lie groups. We demonstrate the effectiveness of this controller in simulation on a number of different Lie groups as well as on hardware with a 7-DOF Sawyer robot arm.

Author(s): Valmik Prabhu, Amay Saxena, and S. Shankar Sastry.
Journal/Conference: April 1, 2020
arxiv.org/abs/2004.00239

Visual Navigation Among Humans with Optimal Control as a Supervisor
Real world navigation requires robots to operate in unfamiliar, dynamic environments, sharing spaces with humans. Navigating around humans is especially difficult because it requires predicting their future motion, which can be quite challenging. We propose a novel framework for navigation around humans which combines learning-based perception with model-based optimal control. Specifically, we train a Convolutional Neural Network (CNN)-based perception module which maps the robot’s visual inputs…

Author(s): Varun Tolani, Somil Bansal, Aleksandra Faust, Claire Tomlin.
Journal/Conference:
arxiv.org/abs/2003.09354

Output-Lifted Learning Model Predictive Control
We propose a computationally efficient Learning Model Predictive Control (LMPC) scheme for constrained optimal control of a class of nonlinear systems, performing iterative tasks. For the considered class of systems, we show how to use historical trajectory data to construct a convex value function approximation along with a convex safe set in a lifted space of virtual outputs…

Author(s): Siddharth H. Nair, Ugo Rosolia, Francesco Borrelli.
Journal/Conference:
arxiv.org/abs/2004.05173

Optimization and Manipulation of Contextual Mutual Spaces for Multi-User Virtual and Augmented Reality Interaction
Spatial computing experiences are physically constrained by the geometry and semantics of the local user environment. This limitation is elevated in remote multi-user interaction scenarios, where finding a common virtual ground physically accessible for all participants becomes challenging. Locating a common accessible virtual ground is difficult for the users themselves, particularly if they are not aware of the spatial properties of other participants. In this paper, we introduce a framework to generate an optimal mutual virtual space for a multi-user interaction setting where remote users’ room spaces can have different layout and sizes…

Author(s): Mohammad Keshavarzi, Allen Y. Yang, Woojin Ko, Luisa Caldas.
Journal/Conference: 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), February 2020
arxiv.org/abs/1910.05998

Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games
We show by counterexample that policy-gradient algorithms have no guarantees of even local convergence to Nash equilibria in continuous action and state space multi-agent settings. To do so, we analyze gradient-play in N-player general-sum linear quadratic games, a classic game setting which is recently emerging as a benchmark in the field of multi-agent learning. In such games the state and action spaces are continuous and global Nash equilibria can be found by solving coupled Riccati equations…

Author(s): Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, S. Shankar Sastry.
Journal/Conference: AAMAS, 2020

LESS is More: Rethinking Probabilistic Models of Human Behavior
Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function…

Author(s): Andreea Bobu, Dexter R.R. Scobee, Jaime F. Fisac, S. Shankar Sastry, and Anca D. Dragan.
Journal/Conference: ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2020
dl.acm.org/doi/10.1145/3319502.3374811

2019

TutoriVR: A Video-Based Tutorial System for Design Applications in Virtual Reality
Virtual Reality painting is a form of 3D-painting done in a Virtual Reality (VR) space. Being a relatively new kind of art form, there is a growing interest within the creative practices community to learn it. Currently, most users learn using community posted 2D-videos on the internet, which are a screencast recording of the painting process by an instructor. While such an approach may suffice for teaching 2D-software tools, these videos by themselves fail in delivering crucial details that are required by the user to understand actions in a VR space…

Author(s): Balasaravanan Thoravi Kumaravel, Cuong Nguyen, Stephen DiVerdi, Bjoern Hartmann.
Journal/Conference: Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI ’19), 2019
DOI: https://doi.org/10.1145/3290605.3300514

Tracking of Deformable Human Avatars through Fusion of Low-Dimensional 2D and 3D Kinematic Models
We propose a method to estimate and track the 3D posture as well as the 3D shape of the human body from a single RGB-D image. We estimate the full 3D mesh of the body and show that 2D joint positions greatly improve 3D estimation and tracking accuracy. The problem is inherently very challenging because of the complexity of the human body, lighting, clothing, and occlusion. To solve the problem, we leverage a custom MobileNet implementation of OpenPose CNN to construct a 2D skeletal model of the human body. We then fit a low-dimensional deformable body model called SMPL to the observed point cloud using initialization from the 2D skeletal model…

Author(s): Ningjian Zhou and S. Shankar Sastry
Technical Report No. UCB/EECS-2019-87
Publication Date: May 19, 2019

Temporal IK: Data-Driven Pose Estimation for Virtual Reality
High-quality human avatars are an important part of compelling virtual reality (VR) experiences. Animating an avatar to match the movement of its user, however, is a fundamentally difficult task, as most VR systems only track the user’s head and hands, leaving the rest of the body undetermined. In this report, we introduce Temporal IK, a data-driven approach to predicting full-body poses from standard VR headset and controller inputs. We describe a recurrent neural network that, when given a sequence of positions and rotations from VR tracked objects, predicts the corresponding full-body poses in a manner that exploits the temporal consistency of human motion…

Author(s): James Lin and James O’Brien
Technical Report No. UCB/EECS-2019-59
Publication Date: May 17, 2019

Real-Time Hand Model Estimation from Depth Images for Wearable Augmented Reality Glasses
This work presents a hand model estimation method designed specifically with augmented reality (AR) glasses and 3D AR interface in mind. The proposed work is capable of estimating the 3D positions of all ten fingers from a single depth image. By leveraging a low-dimensional hand model and exploiting hand geometries from an ego-centric view, we build a lightweight algorithm that is accurate, environment agnostic, and runs in real time on mobile hardware. One major consideration in our design for AR is that the user’s hand is likely to interact with planar surfaces since they serve as ideal “touchscreens”…

Author(s): Bill Zhou, Alex Yu, Joseph Menke and Allen Yang
DOI: 10.1109/ISMAR-Adjunct.2019.00-31

A User Experience Study of Locomotion Design in Virtual Reality Between Adult and Minor Users
Virtual reality (VR) is an important new technology that is fundamentally changing the way people experience entertainment and education content. Due to the fact that most currently available VR products are one-size-fits-all, the user experience of the content interface and user interaction for children is not well understood compared to that for adults. In this study, we seek to explore user experience of locomotion in VR between healthy adults and healthy minors along both objective and subjective dimensions…

Author(s): Zhijiong Huang, Yu Zhang, Kathryn C. Quigley, Ramya Sankar, and Allen Yang
DOI: 10.1109/ISMAR-Adjunct.2019.00027

NeurVPS: Neural Vanishing Point Scanning via Conic Convolution
We present a simple yet effective end-to-end trainable deep network with geometry-inspired convolutional operators for detecting vanishing points in images. Traditional convolutional neural networks rely on aggregating edge features and do not have mechanisms to directly exploit the geometric properties of vanishing points as the intersections of parallel lines. In this work, we identify a canonical conic space in which the neural network can effectively compute the global geometric information of vanishing points locally, and we propose a novel operator named conic convolution…

Author(s): Yichao Zhou, Haozhi Qi, and Yi Ma
Conference: NeurIPS, 2019

L-CNN: End-to-End Wireframe Parsing
We present a conceptually simple yet effective algorithm to detect wireframes in a given image. Compared to the previous methods which first predict an intermediate heat map and then extract straight lines with heuristic algorithms, our method is end-to-end trainable and can directly output a vectorized wireframe that contains semantically meaningful and geometrically salient junctions and lines…

Author(s): Yichao Zhou, Haozhi Qi, and Yi Ma
Conference: International Conference on Computer Vision (ICCV), 2019

Learning to Reconstruct 3D Manhattan Wireframes from a Single Image
In this paper, we propose a method to obtain a compact and accurate 3D wireframe representation from a single image by effectively exploiting global structural regularities. Our method trains a convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points. Compared with the state-of-the-art learning-based wireframe detection methods, our network is much simpler and more unified, leading to better 2D wireframe detection…

Author(s): Yichao Zhou, Haozhi Qi, Yuexiang Zhai, Qi Sun, Zhili Chen, Li-Yi Wei, and Yi Ma
Conference: International Conference on Computer Vision (ICCV), 2019