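# Classification: natural language processing techniques for tasks such as article topic identification

## Classification methods

### Training paper vectors and class vectors at the same time?

The fastText algorithm can train vector representations of the objects being classified and vector representations of the classes at the same time. Could it also be applied to the paper classification problem?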
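To make the question concrete, here is a minimal sketch of the idea in plain Python. It is not the fastText library or its API, only a toy model in the same spirit: a paper is represented by the average of its word vectors, each class has its own vector, and both sets of vectors are updated together by gradient descent on a softmax loss. The function names (`doc_vector`, `class_probs`, `train_joint`, `predict`), the hyperparameters, and the four toy "papers" with their labels are all assumptions made up for illustration.

```python
import math
import random


def doc_vector(words, word_vecs, dim):
    """Average the vectors of the known words (zero vector if none are known)."""
    known = [word_vecs[w] for w in words if w in word_vecs]
    if not known:
        return [0.0] * dim
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


def class_probs(doc, class_vecs):
    """Softmax over the dot products between the paper vector and each class vector."""
    scores = {c: sum(d * x for d, x in zip(doc, v)) for c, v in class_vecs.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_joint(docs, labels, dim=8, epochs=200, lr=0.5, seed=0):
    """Learn word vectors and class vectors together with plain SGD."""
    rng = random.Random(seed)
    vocab = sorted({w for words in docs for w in words})
    word_vecs = {w: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for w in vocab}
    class_vecs = {c: [0.0] * dim for c in sorted(set(labels))}
    for _ in range(epochs):
        for words, label in zip(docs, labels):
            doc = doc_vector(words, word_vecs, dim)
            probs = class_probs(doc, class_vecs)
            grad_doc = [0.0] * dim
            for c, vec in class_vecs.items():
                err = probs[c] - (1.0 if c == label else 0.0)
                for k in range(dim):
                    grad_doc[k] += err * vec[k]  # gradient w.r.t. the paper vector
                    vec[k] -= lr * err * doc[k]  # update the class vector
            for w in words:
                # The paper vector is an average, so each word gets an equal share.
                for k in range(dim):
                    word_vecs[w][k] -= lr * grad_doc[k] / len(words)
    return word_vecs, class_vecs


def predict(words, word_vecs, class_vecs):
    """Return the class whose vector scores highest against the paper vector."""
    dim = len(next(iter(class_vecs.values())))
    probs = class_probs(doc_vector(words, word_vecs, dim), class_vecs)
    return max(probs, key=probs.get)


# Hypothetical toy data: tokenized titles with invented topic labels.
docs = [
    ["galaxy", "redshift", "survey"],
    ["stellar", "mass", "galaxy"],
    ["citation", "network", "clustering"],
    ["bibliographic", "coupling", "citation"],
]
labels = ["astro", "astro", "biblio", "biblio"]

word_vecs, class_vecs = train_joint(docs, labels)
print(predict(["galaxy", "survey"], word_vecs, class_vecs))
print(predict(["citation", "clustering"], word_vecs, class_vecs))
```

Because the paper vectors and the class vectors live in the same space after training, the same dot product used for prediction could also serve as a rough paper-to-topic similarity score. Whether this carries over to real bibliographic data is exactly the open question above.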
