Ph.D. Student, KTH Royal Institute of Technology

Ruiyu Wang

I study learning-based control and develop data-efficient, generalizable visual representations for embodied systems.

2023 - Present: Ph.D. in Computer Science
Core Area: Robot Learning
Interests: Vision, Perception, Generalization

About

Robotics, vision, and learning.

I am a Ph.D. student in computer science at KTH, advised by Prof. Florian T. Pokorny. My research is funded by the CloudGripper project of WASP. Before joining KTH, I received an M.Sc. in quantitative finance from the National University of Singapore and a B.Sc. in physics from Peking University.

My research focuses on learning-based perception and control for embodied agents in the physical world. I study where and how agents derive state information from visual sensory inputs to support action prediction. A central goal of my work is to improve the data efficiency and out-of-domain generalization of visual representations for robotic decision-making.

More broadly, I am also interested in VLM/VLA reasoning and steering.

My curriculum vitae.

Research

Keywords and publications.



Featured publications

TMLR 2026 Reasoning

Nondeterministic Polynomial-time Problem Challenge: An Ever-Scaling Reasoning Benchmark for LLMs

Chang Yang*, Ruiyu Wang*, Junzhe Jiang, Qi Jiang, Qinggang Zhang, Yanchen Deng, Shuxin Li, Shuyue Hu, Bo Li, Florian T. Pokorny, Xiao Huang, and Xinrun Wang

RA-L 2026 Generalization

PALM: Enhanced Generalizability for Local Visuomotor Policies via Perception Alignment

Ruiyu Wang*, Zheyu Zhuang*, Danica Kragic and Florian T. Pokorny

WACV 2025 Zero-Shot

PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement

Shutong Jin*, Ruiyu Wang*, Kuangyi Chen and Florian T. Pokorny

TIP 2025 RL

Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models

Aye Phyu Phyu Aung, Xinrun Wang, Ruiyu Wang, Hau Chan, Bo An, Xiaoli Li and J. Senthilnath

ICRA 2025 Vision

Feature Extractor or Policy Learner: Rethinking the Role of Visual Encoders in Visuomotor Policies

Ruiyu Wang, Zheyu Zhuang, Shutong Jin, Nils Ingelhag, Danica Kragic and Florian T. Pokorny

CoRL 2025 Data Efficiency

MirrorDuo: Reflection-Consistent Visuomotor Learning from Mirrored Demonstration Pairs

Zheyu Zhuang*, Ruiyu Wang*, Giovanni Luca Marchetti, Florian T. Pokorny and Danica Kragic

ICONIP 2025 Zero-Shot

RealCraft: Attention Control as A Tool for Zero-Shot Consistent Video Editing

Shutong Jin, Ruiyu Wang and Florian T. Pokorny

SaGA augmentation preview
CoRL 2024 Generalization

Enhancing Visual Domain Robustness in Behaviour Cloning via Saliency-Guided Augmentation

Zheyu Zhuang, Ruiyu Wang, Nils Ingelhag, Ville Kyrki and Danica Kragic

IROS 2024 Vision

How Physics and Background Attributes Impact Video Transformers in Robotic Manipulation: A Case Study on Planar Pushing

Shutong Jin, Ruiyu Wang, Muhammad Zahid and Florian T. Pokorny

Education

Educational pathway.

Service

Academic activities.

Reviewer

IEEE T-RO (2 times), CoRL (2025, 2026), ICRA (2026), IROS (2025, 2026)

Contact

Get in touch.