-
Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017)
Authors:
Been Kim,
Dmitry M. Malioutov,
Kush R. Varshney,
Adrian Weller
Abstract:
This is the Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017), which was held in Sydney, Australia, August 10, 2017. Invited speakers were Tony Jebara, Pang Wei Koh, and David Sontag.
This is the Proceedings of the 2017 ICML Workshop on Human Interpretability in Machine Learning (WHI 2017), which was held in Sydney, Australia, August 10, 2017. Invited speakers were Tony Jebara, Pang Wei Koh, and David Sontag.
△ Less
Submitted 8 August, 2017;
originally announced August 2017.
-
Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016)
Authors:
Been Kim,
Dmitry M. Malioutov,
Kush R. Varshney
Abstract:
This is the Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), which was held in New York, NY, June 23, 2016.
Invited speakers were Susan Athey, Rich Caruana, Jacob Feldman, Percy Liang, and Hanna Wallach.
This is the Proceedings of the 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), which was held in New York, NY, June 23, 2016.
Invited speakers were Susan Athey, Rich Caruana, Jacob Feldman, Percy Liang, and Hanna Wallach.
△ Less
Submitted 27 July, 2016; v1 submitted 8 July, 2016;
originally announced July 2016.
-
Interpretable Two-level Boolean Rule Learning for Classification
Authors:
Guolong Su,
Dennis Wei,
Kush R. Varshney,
Dmitry M. Malioutov
Abstract:
As a contribution to interpretable machine learning research, we develop a novel optimization framework for learning accurate and sparse two-level Boolean rules. We consider rules in both conjunctive normal form (AND-of-ORs) and disjunctive normal form (OR-of-ANDs). A principled objective function is proposed to trade classification accuracy and interpretability, where we use Hamming loss to chara…
▽ More
As a contribution to interpretable machine learning research, we develop a novel optimization framework for learning accurate and sparse two-level Boolean rules. We consider rules in both conjunctive normal form (AND-of-ORs) and disjunctive normal form (OR-of-ANDs). A principled objective function is proposed to trade classification accuracy and interpretability, where we use Hamming loss to characterize accuracy and sparsity to characterize interpretability. We propose efficient procedures to optimize these objectives based on linear programming (LP) relaxation, block coordinate descent, and alternating minimization. Experiments show that our new algorithms provide very good tradeoffs between accuracy and interpretability.
△ Less
Submitted 18 June, 2016;
originally announced June 2016.
-
Interpretable Two-level Boolean Rule Learning for Classification
Authors:
Guolong Su,
Dennis Wei,
Kush R. Varshney,
Dmitry M. Malioutov
Abstract:
This paper proposes algorithms for learning two-level Boolean rules in Conjunctive Normal Form (CNF, i.e. AND-of-ORs) or Disjunctive Normal Form (DNF, i.e. OR-of-ANDs) as a type of human-interpretable classification model, aiming for a favorable trade-off between the classification accuracy and the simplicity of the rule. Two formulations are proposed. The first is an integer program whose objecti…
▽ More
This paper proposes algorithms for learning two-level Boolean rules in Conjunctive Normal Form (CNF, i.e. AND-of-ORs) or Disjunctive Normal Form (DNF, i.e. OR-of-ANDs) as a type of human-interpretable classification model, aiming for a favorable trade-off between the classification accuracy and the simplicity of the rule. Two formulations are proposed. The first is an integer program whose objective function is a combination of the total number of errors and the total number of features used in the rule. We generalize a previously proposed linear programming (LP) relaxation from one-level to two-level rules. The second formulation replaces the 0-1 classification error with the Hamming distance from the current two-level rule to the closest rule that correctly classifies a sample. Based on this second formulation, block coordinate descent and alternating minimization algorithms are developed. Experiments show that the two-level rules can yield noticeably better performance than one-level rules due to their dramatically larger modeling capacity, and the two algorithms based on the Hamming distance formulation are generally superior to the other two-level rule learning methods in our comparison. A proposed approach to binarize any fractional values in the optimal solutions of LP relaxations is also shown to be effective.
△ Less
Submitted 23 November, 2015;
originally announced November 2015.
-
Lagrangian Relaxation for MAP Estimation in Graphical Models
Authors:
Jason K. Johnson,
Dmitry M. Malioutov,
Alan S. Willsky
Abstract:
We develop a general framework for MAP estimation in discrete and Gaussian graphical models using Lagrangian relaxation techniques. The key idea is to reformulate an intractable estimation problem as one defined on a more tractable graph, but subject to additional constraints. Relaxing these constraints gives a tractable dual problem, one defined by a thin graph, which is then optimized by an it…
▽ More
We develop a general framework for MAP estimation in discrete and Gaussian graphical models using Lagrangian relaxation techniques. The key idea is to reformulate an intractable estimation problem as one defined on a more tractable graph, but subject to additional constraints. Relaxing these constraints gives a tractable dual problem, one defined by a thin graph, which is then optimized by an iterative procedure. When this iterative optimization leads to a consistent estimate, one which also satisfies the constraints, then it corresponds to an optimal MAP estimate of the original model. Otherwise there is a ``duality gap'', and we obtain a bound on the optimal solution. Thus, our approach combines convex optimization with dynamic programming techniques applicable for thin graphs. The popular tree-reweighted max-product (TRMP) method may be seen as solving a particular class of such relaxations, where the intractable graph is relaxed to a set of spanning trees. We also consider relaxations to a set of small induced subgraphs, thin subgraphs (e.g. loops), and a connected tree obtained by ``unwinding'' cycles. In addition, we propose a new class of multiscale relaxations that introduce ``summary'' variables. The potential benefits of such generalizations include: reducing or eliminating the ``duality gap'' in hard problems, reducing the number or Lagrange multipliers in the dual problem, and accelerating convergence of the iterative optimization procedure.
△ Less
Submitted 28 September, 2007;
originally announced October 2007.