About^top

The RecSys 2025 Challenge will be organized by Jacek Dąbrowski, Maria Janicka, Łukasz Sienkiewicz and Gergo Stomfai (Synerise), Dietmar Jannach (University of Klagenfurt, Austria), Marco Polignano ( University of Bari Aldo Moro, Italy), Claudio Pomo (Politecnico di Bari, Italy), Abhishek Srivastava (IIM Visakhapatnam, India), and Francesco Barile (Maastricht University, Netheralnds).

The challenge is designed to promote a unified approach to behavior modeling. Many modern enterprises rely on machine learning and predictive analytics for improved business decisions. Common predictive tasks in such organizations include recommendation, propensity prediction, churn prediction, user lifetime value prediction, and many others. A central piece of information that is used for these predictive tasks are logs of past behavior of users e.g., what they bought, what they added to their shopping cart, which pages they visited. Rather than treating these tasks as separate problems, we propose a unified modeling approach.

To achieve this, we introduce the concept of Universal Behavioral Profiles—user representations that encode essential aspects of each individual’s past interactions. These profiles are designed to be universally applicable across multiple predictive tasks, such as churn prediction and product recommendations. By developing representations that capture fundamental patterns in user behavior, we enable models to generalize effectively across different applications.

Challenge Task^top

The objective of this challenge is to develop Universal Behavioral Profiles based on the provided data, which includes various types of events such as product buy, add to cart, remove from cart, page visit, and search query. These user representations will be evaluated based on their ability to generalize across a range of predictive tasks. The task of the challenge participants is to submit user representations, which will serve as inputs to a simple neural network architecture. Based on the submitted representations, models will be trained on several tasks, including some that are disclosed to participants, called "open tasks," as well as additional hidden tasks. The final performance score will aggregate results from all tasks. We iterate model training and evaluation automatically upon submission, the task of the participants is to submit universal user representations.

Open Tasks:

Churn Prediction: Using the submitted user representations, the model will predict whether an active user (defined as one with at least one product-buy event) will churn, i.e., make no purchases within the next 14 days.
Product Propensity: Using the submitted user representations, the model will predict products which a user is most likely to purchase within the next 14 days, from a predefined subset of items.
Category Propensity: Using the submitted user representations, the model will predict the categories in which a user is most likely to make a purchase within the next 14 days, from a predefined subset of categories.

Hidden Tasks:

In addition to the open tasks, the challenge includes hidden tasks, which remain undisclosed during the competition. The purpose of these tasks is to ensure that submitted Universal Behavioral Profiles are capable of generalization rather than being fine-tuned for specific known objectives. Similar to the open tasks, the hidden tasks focus on predicting user behavior based on the submitted representations, but they introduce new contexts that participants are not explicitly optimizing for. After the competition concludes, the hidden tasks will be disclosed along with the corresponding code, allowing participants to replicate results.

Evaluation^top

The primary metric by which we measure model performance is AUROC. Additionally, the performance of category propensity and product propensity models is evaluated based on the novelty and diversity of the results. In these cases, the task’s score is derived as a weighted sum of all metrics, specifically 0.8 × AUROC + 0.1 × Novelty + 0.1 × Diversity.

For each task, a leaderboard is created based on the respective task scores. The final score, which evaluates the overall quality of user representations and their ability to generalize, is determined by aggregating ranks from all per-task leaderboards using Borda count method. In this approach, each model's rank in a task leaderboard is converted into points, where a model ranked k-th among N participants receives (N - k) points. The final ranking is based on the total points accumulated across all tasks, ensuring that models performing well consistently across multiple tasks achieve a higher overall score.

DataSet^top

The challenge organizers will publish an anonymized dataset containing real-world user interaction logs. All recorded interactions can be utilized to create Universal Behavioral Profiles; however, participants will be required to submit behavioral profiles only for a subset of users, which will be used for model training and evaluation.

The data will consist of four types of events and product attributes:

product_buy

client_id (int64): Numeric ID of the client (user).
timestamp (object): Date and time of the event in the format YYYY-MM-DD HH:mm:ss.
sku (int64): Numeric ID of the item.

add_to_cart

client_id (int64): Numeric ID of the client (user).
timestamp (object): Date and time of the event in the format YYYY-MM-DD HH:mm:ss.
sku (int64): Numeric ID of the item.

remove_from_cart

client_id (int64): Numeric ID of the client (user).
timestamp (object): Date and time of the event in the format YYYY-MM-DD HH:mm:ss.
sku (int64): Numeric ID of the item.

product_properties

sku (int64): Numeric ID of the item.
category (int64): Numeric ID of the item category.
price (int64): Numeric ID of the item's price bucket.
embedding (object): A textual embedding of a product name, compressed using the product quantization method.

page_visit

client_id (int64): Numeric ID of the client.
timestamp (object): Date and time of the event in the format YYYY-MM-DD HH:mm:ss.
url (int64): Numeric ID of a visited URL. The explicit information about what (e.g., which item) is presented on a particular page is not provided.

search_query

client_id (int64): Numeric ID of the client.
timestamp (object): Date and time of the event in the format YYYY-MM-DD HH:mm:ss.
query (object): The textual embedding of the search query, compressed using the product quantization method.

Event type	Number of events
product_buy	~1 700 000
add_to_cart	~5 200 000
remove_from_cart	~1 700 000
page_visit	~150 000 000
search_query	~9 600 000
Total number of unique users	~19 000 000

Download

Universal Behavioral Modeling Dataset © 2025 by Synerise SA is licensed under Creative Commons Attribution-NonCommercial 4.0 International. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc/4.0/
Download dataset

We provide a competition repository that includes a baseline solution and a training pipeline, allowing participants to run experiments on open tasks:
Repository

For further details, please refer to the dedicated website of Synerise.

Prize^top

First three teams from the participants - $3500/$2500/$1500
Special prize for first three academic teams - $2500/1500/500

Participation and Data ^top

Registration & Data Access is open now!

Timeline ^top

When?	What?
10 March, 2025	Start RecSys Challenge Release dataset
10 April, 2025	Submission System Open Leaderboard live
25 June, 2025~~15 June, 2025~~	End RecSys Challenge
30 June, 2025 ~~20 June, 2025~~	Final Leaderboard & Winners EasyChair open for submissions
5 July, 2025 ~~26 June, 2025~~	Code Upload Upload code of the final predictions
12 July, 2025 ~~7 July, 2025~~	Paper Submission Due
28 July, 2025 ~~24 July, 2025~~	Paper Acceptance Notifications
7 August, 2025 ~~5 August, 2025~~	Camera-Ready Papers
September 2025	RecSys Challenge Workshop @ ACM RecSys 2025

Paper Submission Guidelines^top

Submission website: EasyChair

All participants of the challenge are invited to submit if they consider their submission particularly effective, novel, otherwise interesting, or exploiting identified particularities of the data.
Note: paper submission is mandatory if you want to be eligible for a prize. Accepted papers are given a presentation slot at the workshop. At least one author of each accepted paper must attend the workshop and present their work. Please note that a badly written paper or absence of presence at workshop, may prevent you from being eligible for the prize. Please contact the workshop organization if none of the authors will be able to attend the workshop.
Page limit: 4 pages + 1 page for references (ACM SIG Format) in double column format. Instructions for Word and LaTeX authors are given below:
- Microsoft Word: Write your paper using ACM’s interim template. Follow the embedded instructions to apply the paragraph styles to your various text elements. The text is in double-column format and no additional formatting is required at this stage.
- LaTeX: Please use the latest version of the Primary Article Template – LaTeX to create your submission.Start the document with the \documentclass[sigconf]{acmart} command to generate the output in a double-column format. Please see the LaTeX documentation and ACM’s LaTeX best practices guide for further instructions, ignoring the single-column instructions. Do not use the “manuscript” option, otherwise the document will not be compiled in double-column, as required. Check the sample-sigconf.tex file included in the template package for a formatting example. To ensure 100% compatibility with The ACM Publishing System (TAPS), please restrict the use of packages to the whitelist of approved LaTeX packages.
Anonymization of submissions is not required; please include your team name in abstract and text, as well as a link to your code repository, the achieved score, and a reference to the RecSys Challenge Website (http://www.recsyschallenge.com/2025/). Note: This will be replaced with a reference to an overview paper in the RecSys proceedings for the camera-ready version.
Submission website: EasyChair
The topics of interest include, but are not limited to:
- Development of novel algorithms for creating Universal Behavioral Profiles.
- Techniques for ensuring the generalization of user representations across both open and hidden predictive tasks.
- In-depth analysis and creative feature engineering of the provided event data (product interactions, page visits, searches, etc.).
- Methods for predicting user churn, as well as product and category propensity, based on user representations.
- Scalable and efficient approaches for generating user profiles from large-scale behavioral data.
- Innovative approaches to user representation that can be applied to a variety of downstream tasks.
The submitted papers will be evaluated based on novelty, clarity, and presented empirical results.
Each paper will be reviewed by at least two PC members.
Our proceedings will be published in the ACM Digital Library within its International Conference Proceedings Series.
Accepted papers must be presented in the RecSys Challenge Workshop.
Important Note on Publication Fees

Please be aware that this workshop follows the new ACM International Conference Proceedings Series (ICPS) model. Under this model, the financial responsibility for the Article Processing Charge (APC) lies with the authors of accepted papers.

However, the RecSys conference organizers are committed to supporting the community. Authors who face genuine difficulties in covering the APC will have the opportunity to submit a motivated request for financial support. These requests will be evaluated by the main conference organizers on a case-by-case basis. Please note that support is not guaranteed.

For more details on the ACM ICPS model, please refer to the ICPS FAQ and organizer guidance.

Workshop Program and Accepted Papers^top

The RecSys Challenge Workshop will take place on September 22nd, 2025
All times are CET

8:30-8:40	Opening Remarks

8:40-9:40	📈 Session 1: Sequential Dynamics and Temporal Modeling
8:40-8:55	Toward Universal User Representations: Contrastive Learning with Transformers and Embedding Ensembles (🥇) --- Yuki Sawada, Rintaro Hasegawa, Yuhi Nagatsuma, Shugo Takei, Kazuhito Yonekawa, and Hiromu Auchi
8:55-9:10	Encode Me If You Can: Learning Universal User Representations via Event Sequence Autoencoding (🥈) --- Anton Klenitskiy, Artem Fatkulin, Daria Denisova, Anton Pembek, and Alexey Vasilev
9:10-9:25	From Sequences to Profiles: Generating Universal Behavioral Profiles exploiting Recurrent Neural Networks (🥇🏛️) --- Simone Colecchia, Mauro Orazio Drago, Jihad Founoun, Paolo Gennaro, Ernesto Natuzzi, Luca Pagano, Sajjad Shaffaf, Giuseppe Vitello, Andrea Pisani, and Maurizio Ferrari Dacrema
9:25-9:40	State-Space Sequential Encoders for Universal Behavioral Profiles --- Keita Nakano, Shusuke Irikuchi, and Akira Komori

9:40-10:40	🧩 Session 2: Hybrid Systems and Feature Fusion
9:40-9:55	Blending Sequential Embeddings, Graphs, and Engineered Features --- Sergei Makeev, Alexandr Andreev, Vladimir Baikalov, Vladislav Tytskiy, Aleksei Krasilnikov, and Kirill Khrylchenko
9:55-10:10	Universal Behavioral Profiles using Graph Neural Networks (🥉) --- Shoji Takimura, Masato Hashimoto, Wataru Akashi, Ryo Koyama, Ryoki Wakamoto, and Kenichiro Miyaki
10:10-10:25	Heterogeneous Feature Integration for Behavioral Profiles (🥈🏛️) --- Kaito Terasaki, Taketo Yoneda, Kiyotakashi Takagawa, Hayato Maruyama, Yongzhi Jin, Hibiki Ayabe, Kei Harada, and Kazushi Okamoto
10:25-10:40	Beyond Aggregation: A Feature-Fused Universal Behavioral Transformer (🥉🏛️) --- Yichen Liu, Minhao Wang, Ruizhi Zhang, Wen Wu, and Wei Zhang

10:40-11:00	Coffee Break

11:00-12:00	💡 Session 3: Novel Representations and Relational Structures
11:00-11:15	One Paradigm for All: Relational Deep Learning for Representation of Universal Behavioral Profiles --- Marco Valentini, Antonio Ferrara, Chiara Mallamaci, Rossana Zampieri, Ilaria Buonfrate, Daniele Malitesta, Fedelucio Narducci, and Tommaso Di Noia
11:15-11:30	Triple-Feature Transformer with Sparsity Regularization --- Xun Zhou and Haichuan Song
11:30-11:45	BEHAV-E! You are Not Just a Number to Us, but an R2048 Embedding --- Juan Manuel Rodriguez and Antonela Tommasel
11:45-12:00	Beyond Model Size: Narrative Driven Universal Modeling --- Alexandre Rousseau and Yann Veyssiere

12:00-13:00	Lunch

14:00-15:00	Keynote: Kim Falk In data we trust?

15:00-15:30	🏆 Awards Ceremony

Keynote Speaker ^top

Kim Falk

In data we trust?

Abstract: Recommender engineers are essential in creating personalized user experiences, harnessing both technical expertise and engineering skills. Understanding user behavior through data is crucial for building better recommendations, but equally important is a solid grasp of business needs and problem contexts. This keynote examines the interplay between models and data, the challenges of bias, and the stochastic elements that influence system behavior. We’ll also explore where trust lies within recommendation systems and how to balance theoretical approaches with engineering realities. By addressing these challenges and trade-offs, this talk provides insights into building scalable, impactful, and trustworthy recommender systems.

Kim Falk is a Principal Recommender Engineer at DPG Media, where he works on recommender systems in news and VOD platforms. Before, he was a staff recommender engineer at Shopify, where he was the technical lead of the Product Recommendations team. Kim has experience in machine learning but specializes in Recommender systems. He has previously worked on recommenders for customers like British Telecom and RTL+, added user segmentation in Sitecore CMS and worked on Danish NLP. Kim is also the author of Practical Recommender Systems.

Organization ^top

Workshop Program Committee ^top

Alejandro Bellogin, Universidad Autonoma de Madrid
Ludovico Boratto, University of Cagliari
Michal Daniluk, Synerise
Dietmar Jannach, University of Klagenfurt
Julia Neidhardt, TU Wien
Igor Sieradzki, Synerise
Ksawery Smoczyński, Synerise
Mikołaj Spytek, Synerise
Gergely Stomfai, Synerise
Antonela Tommasel, ISISTAN Research Institute, CONICET-UNCPBA

RecSys Challenge 2025

About top

Challenge Task top

Evaluation top

DataSet top

Download

Prize top

Participation and Data top

Timeline top

Paper Submission Guidelines top

Workshop Program and Accepted Papers top

Keynote Speaker top