1. Singh S P, Jaakkola T, Jordan M I. Reinforcement learning with soft state aggregation. Advances in Neural Information Processing Systems 7: Proceedings of the Neural Information Processing Systems Conference (NIPS’94), Nov 28-Dec 1, 1994, Denver, CO, USA. Cambridge, MA, USA: MIT Press, 1995: 361-368
2. Tsitsiklis J N, Van Roy B. An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 1997, 42(5): 674-690
3. Dietterich T G. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 2000, 13: 227-303
4. Parr R. Hierarchical control and learning for Markov decision processes. PhD Thesis. Berkeley, CA, USA: University of California, Berkeley, 1998
5. Simsek Ö, Wolfe P A, Barto A G. Identifying useful subgoals in reinforcement learning by local graph partitioning. Proceedings of the 22nd International Conference on Machine Learning (ICML’05), Aug 7-10, 2005, Bonn, Germany. New York, NY, USA: ACM, 2005: 816-823
6. Sutton R S, Precup D, Singh S. Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 1999, 112(1/2): 181-211
7. Digney B L. Learning hierarchical control structures for multiple tasks and changing environments. From Animals to Animats 5: Proceedings of the 5th International Conference on Simulation of Adaptive Behavior (SAB’98), Aug 17-21, 1998, Zurich, Switzerland. Cambridge, MA, USA: MIT Press, 1998: 321-330
8. McGovern A, Barto A G. Automatic discovery of subgoals in reinforcement learning using diverse density. Proceedings of the 18th International Conference on Machine Learning (ICML’01), Jun 28-Jul 1, 2001, Williamstown, MA, USA. San Francisco, CA, USA: Morgan Kaufmann, 2001: 361-368
9. Stolle M, Precup D. Learning options in reinforcement learning. Proceedings of the 5th International Symposium on Abstraction, Reformulation and Approximation (SARA’02), Aug 2-4, 2002, Kananaskis, Canada. Berlin, Germany: Springer, 2002: 212-223
10. Asadi M, Huber M. Autonomous subgoal discovery and hierarchical abstraction for reinforcement learning using Monte Carlo method. Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference (AAAI’05), Jul 9-13, 2005, Pittsburgh, PA, USA. Cambridge, MA, USA: MIT Press, 2005: 1588-1589
11. Goel S, Huber M. Subgoal discovery for hierarchical reinforcement learning using learnt policies. Proceedings of the 16th International Florida Artificial Intelligence Research Society Conference (FLAIRS’03), May 12-14, 2003, St Augustine, FL, USA. 2003: 346-350
12. Mannor S, Menache I, Hoze A, et al. Dynamic abstraction in reinforcement learning via clustering. Proceedings of the 21st International Conference on Machine Learning (ICML’04), Jul 4-8, 2004, Banff, Canada. San Francisco, CA, USA: Morgan Kaufmann, 2004: 560-567
13. Menache I, Mannor S, Shimkin N. Q-cut: dynamic discovery of subgoals in reinforcement learning. Proceedings of the 13th European Conference on Machine Learning (ECML’02), Aug 19-23, 2002, Helsinki, Finland. Berlin, Germany: Springer, 2002: 295-306
15. Simsek Ö, Barto A G. Skill characterization based on betweenness. Advances in Neural Information Processing Systems 21: Proceedings of the 22nd Annual Conference on Neural Information Processing Systems (NIPS’08), Dec 8-11, 2008, Vancouver, Canada. Cambridge, MA, USA: MIT Press, 2009: 1497-1504
16. Entezari N, Shiri M E, Moradi P. Subgoal discovery in reinforcement learning using local graph clustering. International Journal of Future Generation Communication and Networking, 2011, 4(3): 13-23
17. He R J, Brunskill E, Roy N. PUMA: planning under uncertainty with macro-actions. Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI’10), Jul 11-15, 2010, Atlanta, GA, USA. Cambridge, MA, USA: MIT Press, 2010: 1089-1096
18. Konidaris G, Barto A. Efficient skill learning using abstraction selection. Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI’09), Jul 11-17, 2009, Pasadena, CA, USA. 2009: 1107-1113
19. Wang B N, Gao Y, Chen Z Q, et al. K-cluster subgoal discovery algorithm for option. Journal of Computer Research and Development, 2006, 42(5): 851-855 (in Chinese)
20. Sutton R S, Barto A G. Reinforcement learning: an introduction. Cambridge, MA, USA: MIT Press, 1998
21. Precup D. Temporal abstraction in reinforcement learning. PhD Thesis. Amherst, MA, USA: University of Massachusetts, 2000