This is the most critical part of using a planner where model of the domain is formalised in terms of predicates and actions, and also that of the problem(s) which the planner will be asked to solve. Despite an increase in tools for knowledge engineering, models are still most commonly built by hand. Model acquisition, debugging and maintenance can become frustration points in practice for non-planning developers. They also represent gaps needing impetus for research.
The ability to see generated plans and understand its actions and their role in the plan, can go a long way in user’s acceptance of planning. Developers are used to visual tools from software development environments and they demand the same from planners. Unfortunately, plan visualization has not received much research interest.
Benchmarking value of planning
Any new component in an application has to prove its value and the same is true for a planner. It should be easy for developers to see the impact of planning. Therefore, guidance on experimental setup, and relevant quantitative and qualitative evaluation is valuable.
Plan execution and monitoring
The output of planners can be executed by people or by computing systems. Further, if plans fail, automated replanning methods may be needed to recover from situation and try alternatives. Furthermore, if planning-enabled applications will execute in an environment shared with humans, latter should be able to understand failure and manually re-plan if needed.
Any special features
Occasionally, applications require new features from planners for the overall system to operate effectively. As users, developers would prefer an ability to enhance a planner themselves to support what they may need, and software styles like plug-ins support them.
Planning for Machine Learning Pipelines
The recent rapid growth in business applications of Big Data, machine learning and data science has generated an increasing interest in simplifying the routine steps of data analysis. A wide variety of feature extraction, dimensionality reduction and classification methods are available today, and their implementations can vary between multiple platforms (i.e., Weka, scikit-learn, R, etc.). Furthermore, each of the elements of the pipeline can be instantiated with different parameters. As a result, data scientists evaluate only a small number of analytic pipelines when analyzing a dataset.
Systems that can automatically compose, run and evaluate analytic pipelines systematically and across platforms can significantly improve over the manual process. These systems for pipeline composition must have the following properties:
1. Completeness: The system should evaluate all possible pipelines.
Figure 1: Planning For Machine Learning Pipelines
2. Correctness: Compositions must be correct, ensuring each composed pipeline is a valid program.
3. Cross-Platform Composition: The system must be able to generate, deploy and execute pipelines across multiple platforms.
4. Extensibility: New pipeline elements and platforms must be easy to integrate into the system, without significant development or engineering efforts.
5. Encapsulation: Most changes to a pipeline element should not require changes to other elements.
As shown by a significant body of work on planning-based web service composition in early 2000, planners can be used to compose web services. Further we will discuss how similar techniques can be adapted for the composition machine learning and data science pipelines.
The Planning Approach
Biem and others (Biem et al. 2015) have developed a planning-based approach for the composition of data science and machine learning pipelines (see Figure 1). This system had introduced the rigorous semantically-guided metalearning and parameter selection approach, which relies on a planning system (MARIO) for pipeline composition.
Planning domains are generated from the descriptions of pipeline elements in Cascade, a language developed specifically for this purpose. Cascade provides the following language primitives for describing the problem:
1. Abstract components, which provide a hierarchy of roles pipeline elements can play within the pipelines (e.g., Classifier abstract component can have an SVM abstract component deriving from it);
2. Concrete components, describing platform-specific implementations of abstract components (e.g., Weka Lib SVM implementation);
3. Patterns, which allow describing a high-level flow of abstract components interconnected via data inputs and outputs, for example requiring that Feature Selection component should be followed by Classifier, followed by Presentation;
Cascade descriptions of the elements of the pipelines can be mapped, for example, to HTN (Sohrabi et al. 2013), or MARIO-specific SPPL language. MARIO provides efficient pipeline composition and execution on a variety of platforms, including Weka, SPSS, R, and Apache Spark. The code that drives execution on these platforms is generated by MARIO by composing together code fragments included in concrete component descriptions in Cascade. This composition of code fragments treats code as text, without parsing the language structure, allowing MARIO to support a wide variety of target languages with minimal macro substitution used to inject parameter values and directory paths.
Licensing. A specialized planner was developed for MARIO, which both provided better control over performance on planning tasks of interest and removed the need to deal with licensing restrictions of existing deterministic planners for business use.
Architecture and Tooling. A web-based UI to test compositions was made available to developers in MARIO, and a REST API (Application Programming Interface) was provided for integration with the learning subsystem.
Knowledge engineering. Specialized Eclipse-based IDE was developed for Cascade language, to make it easier to add new pipeline elements. More than 200 component element descriptions in Cascade have been added by the developers using this tooling.
Visualization. While the pipeline compositions were visualized graphically in developer’s web UI of MARIO, the developers have relied on generated code for more advanced debugging of Cascade components and patterns.
Benchmarking value. The system was designed to replace a manual process, and it performed demonstrably well as such. Experiments performed were end-to-end, not only evaluating the planning component, but also system performance overall, measured by quality of the composed analytic pipelines and the speed of the system in doing that. The approach showed a significant performance improvement over the manual approach, driving wide adoption of the system in the organization.
Plan execution and monitoring. Plan execution in MARIO is accomplished by mapping plans to code of the pipeline, and executing the code on the corresponding platforms. Because a central component is responsible for all code generation, it is able to detect and take advantage of reusable pipeline elements. The elements are reusable when the same element is applied to the same data in multiple pipelines as part of model selection. Monitoring involves detecting completion of the pipeline (or execution errors) and routing the results back to learning subsystem. In that process, saving the intermediate results of processing throughout the pipeline proved valuable both for debugging and for presenting the selected pipeline to the end-users.
Planner consumability. The specialized planner was tuned over several years for optimal performance on these tasks. In one case, a new search algorithm was added to address specific requirements (efficient composable pipeline enumeration).
Other considerations. In this application, value of planning was determined by overall value the system was providing by composing and selecting the best machine learning pipeline for a given task on a given data set, as quickly as possible. Integration with a wide variety of systems for plan execution and providing the best possible experience to developers providing domain knowledge, were key to the successful adoption of planning technologies.
Multi-Modal Journey Planning
Computing a multi-modal itinerary is an important application of AI planning. For example, a trip in a city could involve different transport modes, such as public transport, driving, cycling, and walking. In real life, transportation networks can feature uncertainty in aspects such as the arrival and the departure times of buses, the availability of parking, the duration of a driving trip segment (leg) and others.
Today, the de facto standard is to perform deterministic journey planning. Yet, in the presence of uncertainty, predicting whether some planned connections will succeed cannot always be performed accurately. When an action, such as catching a connection, can succeed with a given probability, deterministic planning must rely on a simplifying assumption. An optimistic assumption is that the connection can be performed for sure. The disadvantage of such an assumption is that, if the connection is missed in reality, the plan becomes invalidated. Re-planning is useful only if some good alternatives happen to exist in the state where the plan gets broken.
A pessimistic assumption is that the connection will always be missed. Plans computed in this manner are safer. However, their disadvantage is a lack of being opportunistic. Deterministic planning cannot distinguish between two trajectories with the same deterministic cost, even if one trajectory could feature some probabilistic opportunities, such as faster connections than the planned ones. Planning under uncertainty can provide policies (contingent plans) that are both opportunistic (including good actions that can succeed with a given probability) and safe (including good alternatives when preferred but uncertain actions fail).
The Planning Approach
Botea et al. (2013) introduce Dija, a search-based system for multi-modal journey planning in the presence of uncertainty. The search space is an AND/OR state space where part of the transitions (actions) can have non-deterministic effects. The action of boarding a public transport vehicle can succeed or fail, depending on the stochastic arrival time of the traveler at the stop, and the stochastic departure time of the vehicle.
The search is based on the AO* algorithm, and it implements heuristic functions and pruning techniques. Given a destination, for important locations on the map, such as public transport stops, admissible estimations for the travel time to the destination are pre-computed and stored as a lookup table. Similar lookup tables are populated with admissible estimates of other metrics, such as the number of legs to the destination.
The estimates on the travel time and the number of legs are used as an admissible heuristic in AO*. Admissible lookup tables can further be used for pruning. Imagine that the number of legs up to a current state in the search, plus an admissible estimation of the number of legs needed from the current state to the destination, exceed a maximum acceptable value set by the user. The state at hand can be pruned away. A similar pruning can be performed based on the cycling time and the walking time. Other types of pruning include pruning based on state dominance, and a domain specific technique that prunes away walking actions (Botea 2016).