An Indiana University computer scientist has received a $1.4 million grant from a U.S. Department of Defense agency. Ken Shan will use the funding for research designed to make machine learning technology applicable to different types of data.
March 25, 2014
Bloomington, Ind. — By understanding, managing and inferring patterns from data, machine learning has brought us self-driving vehicles, spam filters and smartphone personal assistants. Now an Indiana University Bloomington computer scientist has received $1.4 million to give machine learning more muscle by making it applicable to greater amounts of more diverse data.
Chung-chieh “Ken” Shan, an assistant professor in the School of Informatics and Computing, will receive the funding from the U.S. Defense Department's Defense Advanced Research Projects Agency over 46 months. The work will focus on probabilistic programming, a relatively new programming paradigm for managing uncertain information.
Currently, most probabilistic problems require expensive programs custom-crafted by hard-to-find experts. These programs remain painfully slow with unpredictable performance when tackling large, complex data sets.
Shan's charge is to develop a new probabilistic programming system that would let more people build machine learning applications, and that would make existing experts more effective at creating powerful applications that need less data to produce accurate results.
“Building probabilistic systems today is error-prone and requires painstaking manual effort,” said Shan, who came to IU in 2013 with a Bachelor of Arts in mathematics and a Ph.D. in computer science from Harvard University. “After that substantial time from a programmer, it can then take many computers considerable time to compute a subtly incorrect result.”
Manual implementation is required because the detailed reasoning needed to orchestrate and monitor inference techniques used in making decisions has not been mechanized, he said. Rather, programmers using machine learning preprocess the model input and post-process the inference output by hand.
“That expectation just does not scale to a greater variety of data and users, or to tuning for diverse hardware,” Shan said.
DARPA has recognized that demand for these capabilities is accelerating, yet every new application still requires a Herculean effort. If integrated models can be constructed across a wide variety of domains and tool types, the new systems could help revolutionize machine learning capabilities in fields including intelligence, natural language processing, predictive analytics and cybersecurity.
One group of researchers has already found that probabilistic computer modeling can interpret verbal autopsy data faster and more cheaply than physician reviews can. Another group is using machine learning to develop “smart drugs” that can automatically detect, diagnose and treat a variety of diseases using a cocktail of chemicals.
Shan said the primary challenges are to develop the theoretical underpinnings that would allow programming to move from a case-by-case, ad hoc basis to mechanization, and to build a system that offers efficient execution.
“We want a system that automatically orchestrates and monitors parallel, online and multi-method inference,” he said. “And the key technique for both efficiency and mechanization is a symbolic and executable representation of models and algorithms that integrates various inference techniques and summarizes the symmetries inherent in large data sets.”
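As a rough illustration only, and not a depiction of Shan's system or any DARPA software, the short Python sketch below shows the general idea of writing a model once as data and then running more than one inference method against it. The coin example, the names `PRIOR`, `P_HEADS`, `posterior_exact` and `posterior_sampled`, and the choice of rejection sampling are all assumptions made for this example.

```python
import random
from fractions import Fraction

# Hypothetical model, written as plain data: a coin is either fair or
# biased toward heads, and each hypothesis is equally likely beforehand.
PRIOR = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
P_HEADS = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}


def likelihood(hypothesis, flips):
    """Probability of seeing this exact sequence of flips under a hypothesis."""
    p = P_HEADS[hypothesis]
    result = Fraction(1)
    for flip in flips:
        result *= p if flip == "H" else 1 - p
    return result


def posterior_exact(flips):
    """Inference method 1: exact enumeration with Bayes' rule."""
    weights = {h: PRIOR[h] * likelihood(h, flips) for h in PRIOR}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def posterior_sampled(flips, n=20000, seed=0):
    """Inference method 2: rejection sampling on the same model.

    Simulate the model many times and keep only the runs whose
    simulated flips match the observed flips.
    """
    rng = random.Random(seed)
    hypotheses = list(PRIOR)
    prior_weights = [float(PRIOR[h]) for h in hypotheses]
    counts = {h: 0 for h in hypotheses}
    for _ in range(n):
        h = rng.choices(hypotheses, weights=prior_weights)[0]
        simulated = ["H" if rng.random() < P_HEADS[h] else "T" for _ in flips]
        if simulated == list(flips):
            counts[h] += 1
    kept = sum(counts.values())
    return {h: counts[h] / kept for h in hypotheses}


if __name__ == "__main__":
    observed = "HHH"
    print(posterior_exact(observed))
    print(posterior_sampled(observed))
```

In this toy case the two methods answer the same question from the same model description, one exactly and one approximately. Choosing and coordinating such methods is the part that today is typically done by hand.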
If successful, the new systems would meet DARPA's five primary objectives for advancing machine learning: shorten code to make models faster and easier to understand; reduce development time and cost to encourage experimentation; facilitate construction of more sophisticated models that use rich domain knowledge and separate queries from underlying code; reduce the level of expertise needed to build applications; and support construction of integrated models across a wide variety of domains and tool types.
After receiving his Ph.D. in 2005, Shan spent six years at Rutgers University, a semester at Cornell University and a semester at Japan's University of Tsukuba before coming to IU Bloomington in 2013.
IU Bloomington is the flagship residential, research-intensive campus of Indiana University. Its academic excellence is grounded in the humanities, arts and sciences, and a range of highly ranked professional programs. Founded in 1820, the campus serves more than 42,000 undergraduate and graduate students pursuing degrees in more than 300 disciplines. Widely recognized for its global and international programs, outstanding technology and historic limestone campus, IU Bloomington serves as a global gateway for students and faculty members pursuing issues of worldwide significance.
Source: Indiana University