Santa Clara University

The University of Maryland, Baltimore County (UMBC) is a public research university in Baltimore County, Maryland. It had a fall 2022 enrollment of 13,991 students and offers 61 undergraduate majors and 92 graduate programs (38 master's, 25 doctoral, and 29 graduate certificate programs). It is also home to the first university research park in Maryland. It is classified among “R1: Doctoral Universities – Very High Research Activity”. UMBC is located about 45 minutes from Washington, D.C.

  • Department: Department of Information Systems/SONG Lab

  • E-mail: songh@umbc.edu

  • We are a public sector organization and we plan to involve industry partners

  • Looking for: Researcher (humanities and social sciences), Researcher (multidisciplinary), Researcher (scientific/technical/engineering)
  • Track: Challenges, Open Ideas, Paired Teams
  • Preferred hosting duration: No preference

  • Maximum number of fellows to be hosted by the organization simultaneously: 6

  • 5 Challenges:

    • Challenge 1, Neurosymbolic AI:
      • To successfully incorporate autonomous systems into their missions, operators must have confidence that those systems will operate safely and perform as intended.
        SONG Lab is motivating new thinking and approaches to artificial intelligence development to enable high levels of trust in autonomous systems.
        SONG Lab seeks breakthrough innovations in the form of new, hybrid AI algorithms that integrate symbolic reasoning with data-driven learning to create robust, assured, and therefore trustworthy systems.
    • Challenge 2, Data-Efficient Machine Learning:
      • Many recent efforts in machine learning have focused on learning from massive amounts of data, resulting in large advances in machine learning capabilities and applications. However, many domains lack access to the large, high-quality, supervised datasets that these methods require and therefore cannot take full advantage of data-intensive learning techniques. This necessitates new data-efficient learning techniques that can learn in complex domains without large quantities of supervised data. This topic focuses on the investigation and development of data-efficient machine learning methods that can leverage knowledge from external or existing data sources, exploit the structure of unsupervised data, and combine the tasks of efficiently obtaining labels and training a supervised model. Areas of interest include, but are not limited to: active learning, semi-supervised learning, learning from “weak” labels/supervision, one-/zero-shot learning, transfer learning/domain adaptation, and generative (adversarial) models, as well as methods that exploit structural or domain knowledge. Furthermore, while fundamental machine learning work is of interest, so are principled data-efficient applications in areas including, but not limited to: computer vision (image/video categorization, object detection, visual question answering, etc.), social and computational networks and time-series analysis, and recommender systems.
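As one illustration of the active-learning area named above, the following minimal sketch selects which unlabeled examples to send for labeling by ranking them on predictive entropy (uncertainty sampling). It is not a method prescribed by the lab; the function names `entropy` and `select_queries` and the toy `pool` of predicted class probabilities are hypothetical and introduced only for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(predicted_probs, k):
    """Return indices of the k pool examples with the highest predictive entropy.

    predicted_probs: list of per-example class-probability lists, as produced
    by any probabilistic classifier (the classifier itself is assumed here).
    """
    ranked = sorted(
        range(len(predicted_probs)),
        key=lambda i: entropy(predicted_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy pool of unlabeled examples (hypothetical model outputs, two classes).
pool = [
    [0.98, 0.02],  # confident prediction, low labeling value
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],
    [0.50, 0.50],  # maximally uncertain
]

queries = select_queries(pool, k=2)
print(queries)  # indices of the two most uncertain examples
```

In a full loop, the selected examples would be labeled, added to the training set, and the model retrained before the next round of selection; the labeling budget `k` per round is a design choice.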
    • Challenge 3, Verifiable Reinforcement Learning:
      • Reinforcement learning and sequential decision-making have been revolutionized in recent years thanks to advances in deep neural networks. One of the most prominent breakthroughs was the AlphaGo system and its victory over the world Go champion. However, even in this impressive system, the learned agent performed sub-optimal actions that puzzled both the Go and the reinforcement learning communities. Such failures in decision-making motivate the need for methods that can provide (statistical) guarantees on the actions performed by an agent. We are interested in establishing such guarantees in both discrete and continuous systems where agents learn policies, or action plans, through experience by interacting with their environment. To this end, we seek specialists from areas such as optimal control, game theory, hybrid automata, formal methods, machine learning, and multi-objective optimization. Some problems of interest in this domain include, but are not limited to, the following:
        • Decision-making in partially observable Markov Decision Processes.
        • Satisfying probabilistic guarantees on the behavior of a learned agent when approximate value functions (e.g., neural networks) are used to measure utility.
        • Control of hybrid systems resulting from the discretization of continuous space induced by a given set of behavioral specifications. Such specifications are typically defined by a temporal logic such as computation tree logic and linear temporal logic.
        • Decision-making in adversarial stochastic games.
        • Reinforcement learning as a constrained optimization problem in which expected long-term rewards are maximized subject to bounds on the probabilities of satisfying various behavioral specifications.
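The constrained-optimization view in the last item can be sketched as follows. The notation is assumed here for illustration and does not come from the original text: $\pi$ is the policy, $\tau$ a trajectory generated by $\pi$, $r$ the reward, $\gamma \in [0, 1)$ a discount factor, $\varphi_1, \dots, \varphi_m$ the behavioral specifications (for example, temporal-logic formulas), and $p_i$ the required lower bound on the probability of satisfying $\varphi_i$.

```latex
\begin{aligned}
\max_{\pi}\quad & \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
\text{subject to}\quad & \Pr_{\tau \sim \pi}\!\left[\tau \models \varphi_i\right] \ge p_i, \qquad i = 1, \dots, m.
\end{aligned}
```

Bounds of the opposite kind (for example, keeping the probability of violating a safety specification below a threshold) fit the same template with the inequality reversed.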
    • Challenge 4, Mathematical Methods for Deep Learning:
      • Background: Deep learning (DL) has emerged as a powerful approach for knowledge acquisition and for the analysis and exploitation of large datasets. In particular, it has made substantial advances in visual object classification and speech recognition. Innovative DL architectures for specific applications in artificial intelligence and autonomy continue to be developed at a fast pace. DL is also making inroads into other disciplines, such as the physical and biological sciences, for developing better predictive models and potentially discovering important phenomena from data. However, mathematical tools for understanding the many successes of DL do not yet exist. For example, it is not clear what classes of concepts can be learned by a deep network; there is no principled way of designing a DL architecture; and it is not understood why stochastic gradient descent (SGD), a seemingly simple optimization strategy, is so effective at finding the global minimum or a good local minimum of a high-dimensional, non-convex cost function. More importantly, DL is also a major technology intended for decision-making in safety-critical applications. However, since the current approach to verification of DL is based on purely empirical test and evaluation using annotated datasets, it becomes impractical to obtain a reliable performance envelope. Moreover, empirical verification is impossible for a deployed agent that encounters conditions it was not trained for, or for an unsupervised self-learning agent, because the data is not annotated and it is not even clear what to test the agent on. Therefore, mathematical tools are needed for analyzing what the agent has learned and with what confidence, for controlling and improving its learning, and for predicting its performance under various conditions.
        Such tools will play an essential role in developing analytical and formal methods for verification and validation of the agent, and will lead to advances in explaining its decisions and building trust in its behavior. DL is a challenging, complex nonlinear system. However, certain promising early advances in the mathematics of DL, such as connections between PDEs and DL, methods for growing DL architectures, and the formulation of beneficial cost functions, indicate that principled, sound, and systematic methods for the design and analysis of DL could be achieved.
        Objective: Develop mathematical, formal, and optimization methods for designing Deep Learning architectures, training them, analyzing their performance, and predicting their behavior under different conditions with provable guarantees. These methods should be applicable to various models including neural networks, compositional networks, and-or graphs, and similar feedforward and recurrent learning and inference architectures.