## Topic: Artificial Intelligence for Complex Systems

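The "Networks, Geometry and Machine Learning" reading camp is the third installment of the "Artificial Intelligence for Complex Systems" series, hosted by Swarma Club (集智俱乐部) and funded by the Kaifeng Foundation. It is planned as a five-day event in August 2019 devoted to reading and discussing frontier literature, covering complex networks, statistical physics, quantum physics, and machine learning. The aim is to draw new research inspiration from these frontier fields and to foster interaction and exchange among the Swarma scientist members, so as to nurture new research ideas.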

## Participants

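• Zhang Jiang (张江), professor at the School of Systems Science, Beijing Normal University; founder of Swarma Club and of Swarma AI Academy. Research interests include complex systems, graph networks, collective attention flow, flow networks, and allometric scaling laws.
• Zhang Pan (张潘), associate researcher at the Institute of Theoretical Physics, Chinese Academy of Sciences; Swarma scientist. He works on statistical physics and complex systems, specifically applying statistical-physics theories such as spin glass theory and replica symmetry breaking to problems in applied mathematics, networks, and computer science. His interests are broad, spanning many aspects of physics, statistics, networks, and machine learning, and he keeps expanding into new areas.
• You Yizhuang (尤亦庄), assistant professor in the Department of Physics, University of California, San Diego; Swarma scientist. His main research area is quantum many-body physics, with a focus on emergent phenomena and critical phenomena arising from collective behavior. He is also interested in information theory (especially quantum information), complex systems, and artificial intelligence.
• Wu Lingfei (吴令飞), assistant professor at the School of Computing and Information, University of Pittsburgh; core member of Swarma Club and Swarma scientist. Research interests: the geometry of thinking and its application to the optimal combination of human knowledge and skills, including science of science, science of skills, science of teams, and future of work.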

• Time: TBD
• Location: TBD

## Discussion Topics

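• Automated modeling of complex systems
• Causal inference methods
• Computational sociology of skills, occupations, and the social division of labor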

...

### Automated Modeling of Complex Systems

• Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, et al.: Graph networks as learnable physics engines for inference and control, arXiv, 2018

• Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, et al.: Neural Relational Inference for Interacting Systems, arXiv:1802.04687, 2018

• Seungwoong Ha, Hawoong Jeong: Towards Automated Statistical Physics: Data-driven Modeling of Complex Systems with Deep Learning, arXiv, 2020

• Danilo Jimenez Rezende, Shakir Mohamed: Variational Inference with Normalizing Flows, arXiv:1505.05770v6

• Fan Yang, Ling Chen, Fan Zhou, Yusong Gao, Wei Cao: Relational State-Space Model for Stochastic Multi-Object Systems, arXiv:2001.04050v1

• Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud: Neural Ordinary Differential Equations, arXiv:1806.07366v5

• Michael John Lingelbach, Damian Mrowca, Nick Haber, Li Fei-Fei, Daniel L. K. Yamins: Towards Curiosity-Driven Learning of Physical Dynamics, "Bridging AI and Cognitive Science" (ICLR 2020)

• Chengxi Zang, Fei Wang: Neural Dynamics on Complex Networks, AAAI 2020

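A best paper at AAAI 2020. It combines Neural ODEs with graph networks to address general dynamics problems on complex networks, solving them via optimal control principles. The paper also recasts semi-supervised node classification as an optimal control problem and reports significant improvements.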
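To make "dynamics on a network as an ODE" concrete, here is a minimal sketch. It is not the model from the paper: the right-hand side `heat_rhs` is a hand-written stand-in (heat diffusion, dX/dt = -LX) for the learned graph network, and the integrator is plain forward Euler rather than an adaptive ODE solver. All function names are hypothetical.

```python
import numpy as np

def graph_laplacian(A):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(A.sum(axis=1)) - A

def heat_rhs(X, A):
    """Stand-in for a learned f(X, A): heat diffusion dX/dt = -L X."""
    return -graph_laplacian(A) @ X

def integrate(X0, A, rhs, dt=0.05, steps=200):
    """Forward-Euler rollout of dX/dt = rhs(X, A); returns the whole trajectory."""
    X = np.array(X0, dtype=float)
    traj = [X.copy()]
    for _ in range(steps):
        X = X + dt * rhs(X, A)
        traj.append(X.copy())
    return np.stack(traj)

# Toy example: a ring of 4 nodes with all "heat" initially on node 0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
X0 = np.array([[1.0], [0.0], [0.0], [0.0]])
traj = integrate(X0, A, heat_rhs)
```

In a Neural-ODE setting, `heat_rhs` would be replaced by a parameterized function whose weights are fitted so that the rollout matches observed node states.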

### The Intrinsic Geometry of Word2Vec

(Jointly edited by Yiling, hauchuang, lingfei)

#### Orders of associations (1st and 2nd orders)

Word2vec learns dual embeddings: each word gets two vector representations, a term embedding and a context embedding. There are two training frameworks, CBOW (many-to-one) and Skip-gram with Negative Sampling (SGNS, one-to-many). In SGNS the term (center word) is read from the IN matrix and the context words from the OUT matrix; in CBOW the roles are reversed, with the context words read from IN and the predicted term from OUT.
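The two tables can be made explicit with a minimal NumPy sketch of one SGNS update. This is an illustration under assumptions, not the reference word2vec implementation: the vocabulary size, dimension, learning rate, and the names `W_in`, `W_out`, `sgns_loss`, and `sgns_step` are all hypothetical, and the center word is assumed to be read from IN and the context from OUT.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 6, 4

# Two embedding tables: IN (input side) and OUT (output side).
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_loss(center, context, negatives):
    """Negative-sampling loss for one (center, context) pair."""
    v = W_in[center]                                  # center word from IN
    pos = np.log(sigmoid(W_out[context] @ v))         # observed pair
    neg = np.log(sigmoid(-W_out[negatives] @ v)).sum()  # sampled noise pairs
    return -(pos + neg)

def sgns_step(center, context, negatives, lr=0.5):
    """One SGD update of both tables for a single training pair."""
    v = W_in[center].copy()
    ids = np.concatenate(([context], negatives))
    labels = np.zeros(len(ids))
    labels[0] = 1.0                                   # 1 for the true context, 0 for negatives
    err = sigmoid(W_out[ids] @ v) - labels            # d(loss)/d(score)
    W_in[center] -= lr * (err @ W_out[ids])
    # subtract.at accumulates correctly even if an id repeats among the negatives
    np.subtract.at(W_out, ids, lr * np.outer(err, v))
```

After training, the IN-IN similarity of two words is `W_in[a] @ W_in[b]` (usually cosine-normalized), while the IN-OUT score is `W_in[a] @ W_out[b]`.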

1. Nalisnick, E., Mitra, B., Craswell, N., & Caruana, R. (2016, April). Improving document ranking with dual word embeddings. In Proceedings of the 25th International Conference Companion on World Wide Web (pp. 83-84).

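This paper shows that IN-IN cosine similarity tends to capture substitutive word pairs (words of similar type or function), whereas IN-OUT cosine similarity tends to capture collocative word pairs (words that co-occur).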

The IN-OUT dot product in SGNS implicitly models PPMI. What, then, does IN-IN model? Is it the Jensen-Shannon divergence between the PPMI rows (or columns)?
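One way to probe this question empirically is to compute PPMI from co-occurrence counts and compare rows with the Jensen-Shannon divergence. The sketch below only sets up the two quantities; it does not answer the question. Normalizing each PPMI row into a probability vector before comparing is an assumption made here, and the toy count matrix is invented for illustration.

```python
import numpy as np

def ppmi(counts):
    """Positive PMI from a word-by-context co-occurrence count matrix."""
    p_wc = counts / counts.sum()
    p_w = p_wc.sum(axis=1, keepdims=True)
    p_c = p_wc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = 0.0      # zero counts contribute no association
    return np.maximum(pmi, 0.0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two nonnegative, nonzero vectors."""
    p = p / p.sum()                   # assumption: rows are normalized to distributions
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy counts: words 0 and 1 share contexts, word 2 does not.
counts = np.array([[4, 0, 1],
                   [4, 0, 1],
                   [0, 5, 1]], dtype=float)
P = ppmi(counts)
same = js_divergence(P[0], P[1])
different = js_divergence(P[0], P[2])
```

A test of the conjecture would compare `js_divergence` between PPMI rows against IN-IN similarities from a trained model on the same corpus.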

#### Diffusion

Word2vec can be viewed as a diffusion model, which would explain why it predicts the diffusion of collective attention in the search for scientific knowledge. Based on the duality between dynamics on networks (Newtonian) and the geometry of networks (Einsteinian), we can hypothesize that every network diffusion model with PMI on edges as "geo-distance" has a word2vec / representation learning / manifold learning counterpart (the Brockmann and Helbing paper below in fact defines "effective distance" in a way similar to PMI).

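Brockmann and Helbing, The Hidden Geometry of Complex, Network-Driven Contagion Phenomena, https://science.sciencemag.org/content/342/6164/1337/. Traditional epidemic studies often model disease spread using geographic distance. However, geographic distance is not an effective measure for predicting spread; commuting distance (the inverse of passenger flow) is. Once cities are rearranged by commuting distance, the extent of disease spread becomes a linear function of commuting distance.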
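As a rough sketch of the effective-distance idea: Brockmann and Helbing define the effective length of a direct link from n to m as 1 - log P_mn, where P_mn is the fraction of traffic leaving n that arrives at m, and take the effective distance as the shortest path under these lengths. The code below follows that definition with Floyd-Warshall; the `flux[m, n]` indexing convention, the function name, and the toy matrix are assumptions for illustration, and every node is assumed to have nonzero outflow.

```python
import numpy as np

def effective_distance(flux):
    """All-pairs effective distance; D[m, n] is the distance from n to m.

    flux[m, n]: travellers per unit time going from node n to node m
    (assumed convention; each column must have a positive sum).
    """
    flux = np.asarray(flux, dtype=float)
    P = flux / flux.sum(axis=0, keepdims=True)   # share of n's outflow reaching m
    with np.errstate(divide="ignore"):
        D = 1.0 - np.log(P)                      # direct-link length; inf where no link
    np.fill_diagonal(D, 0.0)
    # Floyd-Warshall: relax every pair through intermediate node k
    for k in range(len(D)):
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

# Toy chain 0 - 1 - 2 with no direct link between nodes 0 and 2.
flux = np.array([[0, 10, 0],
                 [10, 0, 10],
                 [0, 10, 0]], dtype=float)
D = effective_distance(flux)
```

The log of a transition probability plays the same role here that PMI plays on the edges of a co-occurrence network, which is the analogy drawn in the paragraph above.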

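Tshitoyan et al., Unsupervised word embeddings capture latent knowledge from materials science literature, https://www.nature.com/articles/s41586-019-1335-8. Word2vec embeddings capture complex materials science concepts, such as the underlying structure of the periodic table and structure-property relationships in materials, and can effectively predict the evolution of the knowledge space of materials chemistry.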

#### Orthogonality

Sparse coding (sparse representation)

$\mathcal{L}_{\text{sc}} = \underbrace{||WH - X||_2^2}_{\text{reconstruction term}} + \underbrace{\lambda ||H||_1}_{\text{sparsity term}}$

Sparse coding is a representation learning method that aims at finding a sparse representation of the input data (also known as sparse coding) in the form of a linear combination of basic elements as well as those basic elements themselves. These elements are called atoms and they compose a dictionary. Atoms in the dictionary are not required to be orthogonal, and they may be an over-complete spanning set. This problem setup also allows the dimensionality of the signals being represented to be higher than the one of the signals being observed. The above two properties lead to having seemingly redundant atoms that allow multiple representations of the same signal but also provide an improvement in sparsity and flexibility of the representation.
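The loss above can be minimized over the codes with a few lines of NumPy. This sketch reads X as the data (one signal per column), W as the dictionary whose columns are atoms, and H as the sparse codes; the source does not define these symbols, so that reading is an assumption. It solves only the coding step with W held fixed, using ISTA (iterative soft-thresholding), which is one standard choice and not necessarily the algorithm used in the papers cited here.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1: shrink every entry toward zero by tau."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def objective(X, W, H, lam):
    """Reconstruction term plus sparsity term, as in the loss above."""
    return np.sum((W @ H - X) ** 2) + lam * np.abs(H).sum()

def sparse_codes(X, W, lam=0.1, n_iter=200):
    """ISTA for min_H ||W H - X||_2^2 + lam * ||H||_1 with the dictionary W fixed."""
    L = 2.0 * np.linalg.norm(W, 2) ** 2          # Lipschitz constant of the gradient
    H = np.zeros((W.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * W.T @ (W @ H - X)           # gradient of the reconstruction term
        H = soft_threshold(H - grad / L, lam / L)
    return H

# Over-complete toy dictionary: 5 unit-norm atoms in 3 dimensions.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
W /= np.linalg.norm(W, axis=0)
H_true = np.zeros((5, 2))
H_true[1, 0], H_true[4, 1] = 1.5, -2.0
X = W @ H_true
H_est = sparse_codes(X, W)
```

Full dictionary learning would alternate this coding step with an update of W; the over-complete W here (more atoms than dimensions) is what allows several representations of the same signal.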

Arora, S., Li, Y., Liang, Y., Ma, T., & Risteski, A. (2018). Linear algebraic structure of word senses, with applications to polysemy. Transactions of the Association for Computational Linguistics, 6, 483-495. This paper shows that multiple word senses reside in linear superposition within the word embedding and that simple sparse coding can recover vectors that approximately capture the senses. A novel aspect of the technique is that each extracted word sense is accompanied by one of about 2000 "discourse atoms" that gives a succinct description of which other words co-occur with that word sense. Discourse atoms can be of independent interest, and make the method potentially more useful. Empirical tests are used to verify and support the theory.

Honglak Lee et al., Efficient sparse coding algorithms.

### 2020.05.29 Second Discussion

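• 1. The reading camp is confirmed for August.
• 2. Process: hold a reading group first to discuss the relevant topics; the format of the subsequent reading camp and public events is not yet confirmed.
• 3. Settle the theme of the reading camp (along the lines of artificial intelligence for complex systems). Sub-topics may include automated learning of dynamics, causal inference, NLP-related techniques, and so on; list them all first and consolidate them into themes later.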

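• 1. Consider, given the instructors' needs for computing power and personnel, how best to use the reading camp's funds to advance research and improve research efficiency.
• 2. All instructors are asked to add their recommended papers to the wiki (the papers are quite specialized, so please include a brief introduction of roughly 200-300 characters to help with preparation and publicity).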