Yichen Xu

An undergraduate student studying computer science at Beijing University of Posts and Telecommunications.
A research intern at CRIPAC, CASIA, with a main interest in machine learning on graphs.
A research intern at LAMP, EPFL, working on the implementation and theory of Scala.
An enthusiastic Haskeller who loves functional programming, programming language theory, compiler construction and formal methods.

-> Github | Google Scholar | Blog

Machine Learning | Graph Representation Learning | Programming Language Theory | Compilers | Functional Programming | Hardware Design

Here is my CV. (English | 中文)

Recent updates:

(2021.10.12) The paper An Empirical Study of Graph Contrastive Learning has been accepted to NeurIPS 2021!

(2021.9.1) A new paper, Implementing Path-Dependent GADT Reasoning for Scala 3, has been accepted to Scala Symposium '21!

(2021.8.15) One paper has been accepted to CIKM '21!


Bachelor of Computer Science
at Beijing University of Posts and Telecommunications
2018 - 2022, expected
  • Overall GPA: 3.86/4.0, 93.16/100 (ranking 1st of 379)
  • Awards and honours: National Scholarship (two consecutive years, awarded only to the top 1% of students); 1st Prize in NSCSCC 2020 (MIPS CPU Design Contest)

Research Experience

Research Intern
at CRIPAC, CASIA
2019.6 - 2021.6

My research interests lie mainly in machine learning and graph representation learning. Current research directions include self-supervised training of GNNs and contrastive learning on graphs. I am supervised by Prof. Shu Wu and work with Yanqiao Zhu.

Research Intern
at Programming Methods Laboratory, Ecole Polytechnique Fédérale de Lausanne
2021.2 - current

I work on dotty, the Scala 3 compiler, and I am supervised by Aleksander Boruch-Gruszecki. Specifically, I am trying to improve the implementation of Generalized Algebraic Datatypes (GADTs) in the compiler. I am also working on the theory of GADT reasoning in Dependent Object Types. I have done the soundness proof for an extended variant of the pDOT calculus that allows us to invert the subtyping evidence. The extended calculus is useful for formalizing GADT reasoning.

Research Intern
at IIIS, Tsinghua University
2021.9 - current

I am supervised by Prof. Zhilin Yang and work on designing and pretraining large-scale language models for code intelligence. My current research interests lie in code generation with pretrained language models like GPT.


Implementing Path-Dependent GADT Reasoning for Scala 3
Yichen Xu, Aleksander Boruch-Gruszecki, Lionel Parreaux
Preprint | DOI | Slides
Scala Symposium 2021
An Empirical Study of Graph Contrastive Learning
Yanqiao Zhu, Yichen Xu, Qiang Liu, Shu Wu
OpenReview | Slides | Poster | Code
NeurIPS 2021
Structure-Aware Hard Negative Mining for Heterogeneous Graph Contrastive Learning
Yanqiao Zhu, Yichen Xu, Hejie Cui, Carl Yang, Qiang Liu, Shu Wu
arXiv | Slides
DLG@KDD 2021
Disentangled Self-Attentive Neural Networks for Click-Through Rate Prediction
Yichen Xu, Yanqiao Zhu, Feng Yu, Qiang Liu, Shu Wu
arXiv | Code
CIKM 2021
Graph Contrastive Learning with Adaptive Augmentation
Yanqiao Zhu, Yichen Xu, Feng Yu, Shu Wu, Liang Wang
arXiv | Code | Slides | Blog
WWW 2021
Deep Graph Contrastive Representation Learning
Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, Liang Wang
arXiv | Code | Slides | Video
GRL+ @ ICML Workshop 2020
CAGNN: Cluster-Aware Graph Neural Networks for Unsupervised Graph Representation Learning
Yanqiao Zhu, Yichen Xu, Feng Yu, Shu Wu, Liang Wang
Preprint, in submission to TIST

Selected Projects


A compiler that compiles a subset of Scala to C. The subset covers the main language features, including basic functional and object-oriented programming. It implements the Hindley-Milner type system for type checking and inference.


A batteries-included library for graph contrastive learning with PyTorch. It implements a wide variety of contrastive objectives, data augmentations, contrasting modes and other utilities useful for implementing contrastive learning on graphs.


Sircle is a DSL designed for my project CoordML, a tool that automates experiments in machine learning research. Sircle is used to define experiment tasks with ease. It is interpreted, impure and dynamically typed, and it supports functional programming styles with first-class functions, lambdas and currying. More information about Sircle can be found at Introduction to Sircle.


A cute little library for Cats.

  • Implements common algebraic structures in Scala, including Arrow, Semigroup, Functor, Applicative, Monad, Alternative.
  • Provides monad instances for Scala classes like Option, Either and List.
  • Includes common monads and monad transformers out of the box (OptionT, StateT, ReaderT, Parser combinators, Free, ...).
  • Creates interesting toys like a typed tagless final interpreter with de Bruijn indices.

A full-featured, high-performance MIPS32 cache written in Chisel3. It transfers data via the AXI bus in wrap mode. It has a victim cache and supports write buffering, and all of its parameters are configurable. It is part of EasterMIPS, a MIPS32 CPU that our team built for the NSCSCC 2020 competition, where we won the First Prize.


Group Theory Review
This talk reviews group theory taught in my undergraduate course Discrete Mathematics.
Learn Emacs through Org-mode
This talk gives a short introduction to the editor Emacs through the usage of Org-mode as a GTD tool.
A Beginner's Guide to Functional Programming
Handout | Slides (in Chinese)
This talk introduces the basic concepts, usage and theoretical backgrounds of functional programming.
How a Website Works
Slides (in Chinese)
This talk explains how a website works. It covers the basic concepts of front-end development tools, website rendering, the HTTP protocol and backend technologies.
The Path to Bugless Programs
A First Introduction to Formal Verification
Slides (in Chinese)
This talk gives an introduction to formal verification, including a basic guide for Hoare logic and formal verification tools.