About

The RecSys 2023 Challenge will be organized by Sarang Brahme, Rahul Agarwal (ShareChat), Abhishek Srivastava (IIM Visakhapatnam, India), Liu Yong (Huawei, Singapore) and Athirai Irissappane (Amazon, USA) based on the data provided by ShareChat. This year’s challenge will focus on online advertising, improving deep funnel optimization, and user privacy.

The challenge is brought to you by ShareChat. ShareChat is India’s largest homegrown social media company, with 400+ million MAUs across all its platforms. Headquartered in Bengaluru, ShareChat is spreading its team globally across India, the USA, and Europe. We have the best-in-class AI & ML technology and the strongest feed ranking system powering our growth. We aim to create a million monetizable creators with USD 450 million in creator earnings across ShareChat and Moj by 2025.

Challenge Task

Online advertising has been a multi-billion dollar industry since the early 2000s and has played a significant role in the growth of the internet. Its key advantages over conventional mass advertising are its inherent ability to personalize to users, its democratization of advertising, which enables businesses of all sizes to participate, and the measurable impact of money spent that it gives advertisers. Over the past two decades, the nature of online advertising has also evolved tremendously from pure banner-based advertising, where advertisers were charged based on the number of ad impressions, to deep funnel optimizations, where advertisers can optimize for eventual sales.

The efficacy of deep funnel optimization required extensive personalization and opened up rich problems in real-time auction design, large-scale machine learning, modeling delayed feedback, and behavioral understanding. As these systems matured, we also started developing a rich understanding of the need to preserve user privacy, ensure AI fairness, and prevent adversarial exploitation of the platform. In this challenge, we aim to provide a real-world ad dataset from the ShareChat and Moj apps to act as a benchmark for research into deep funnel optimization with a focus on user privacy.

DataSet

The dataset corresponds to roughly 10M random users who visited the ShareChat and Moj apps over three months. We have sampled each user's activity to generate 10 impressions per user. Our target variable is whether or not the user installed an app.
To represent a user, several features are provided:
1. Demographic features: These include age, gender, and the geographic location from which the user is accessing the ShareChat/Moj app. The users are sampled such that we have an approximately uniform distribution of users across the demographic features. The user's location is hashed to a 32-bit value to anonymize the data.
2. Content preference embeddings: These embeddings are trained based on the users' consumption of the various non-ad content on the ShareChat/Moj app.
3. App affinity embeddings: These embeddings are trained based on the past apps installed by the user on our platform.
We also have features corresponding to ads:
1. Ad categorical features: These features represent different characteristics of an ad, including the size of the ad, the category of the ad, etc. The features are hashed to 32-bit values to anonymize the data.
2. Ad embedding: These represent the actual video/image content of the ad.
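The 32-bit hashing mentioned for the location and ad categorical features can be illustrated with a minimal sketch. The source does not say which hash function or salt was used, so the SHA-256 truncation, the `hash32` name, and the `demo-salt` value below are assumptions chosen for illustration only.

```python
import hashlib


def hash32(value: str, salt: str = "demo-salt") -> int:
    """Map a raw categorical value to an unsigned 32-bit integer.

    Hypothetical helper: the actual anonymization scheme is not documented.
    """
    digest = hashlib.sha256((salt + "|" + value).encode("utf-8")).digest()
    # Keep the first 4 bytes of the digest: an integer in [0, 2**32).
    return int.from_bytes(digest[:4], "big")


# Toy categorical values (made up for this example).
raw_categories = ["gaming", "finance", "gaming"]
hashed = [hash32(c) for c in raw_categories]
```

Equal raw values map to equal hashed values, so a model can still treat the hashed column as a categorical feature even though the original labels are hidden.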
To capture the historical interactions between users and ads, we also provide:
1. Count features: These features represent the user's interactions with ads, advertisers, and categories of advertisers over time windows of different lengths.
Every row of the data has an associated numeric ID and represents one ad impression shown to a user, together with whether it resulted in a click on the ad and, subsequently, an install.
We do not provide the semantics of the individual features.
The training data consists of subsampled impressions/clicks/installs from the past 2 weeks; the task is to predict the probability of an install on the 15th day.
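One way to mirror this setup locally is to hold out the last available training day as a stand-in for the unseen 15th day. The sketch below assumes hypothetical `day` and `is_installed` fields (the source does not document column names) and uses log loss as an example score for predicted probabilities; the official evaluation metric is not stated here.

```python
import math
from typing import Dict, List, Tuple

Row = Dict[str, int]  # hypothetical schema: {"day": ..., "is_installed": ...}


def split_by_day(rows: List[Row], holdout_day: int) -> Tuple[List[Row], List[Row]]:
    """Train on days before holdout_day, validate on holdout_day itself."""
    train = [r for r in rows if r["day"] < holdout_day]
    valid = [r for r in rows if r["day"] == holdout_day]
    return train, valid


def install_rate(rows: List[Row]) -> float:
    """Fraction of impressions that ended in an install."""
    return sum(r["is_installed"] for r in rows) / len(rows)


def log_loss(labels: List[int], prob: float, eps: float = 1e-15) -> float:
    """Mean binary cross-entropy of one constant predicted probability."""
    p = min(max(prob, eps), 1 - eps)
    total = sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y in labels)
    return -total / len(labels)


# Toy data: 14 days, 4 impressions per day, one install per day.
rows = [{"day": d, "is_installed": int(i == 0)} for d in range(1, 15) for i in range(4)]

train, valid = split_by_day(rows, holdout_day=14)
baseline = install_rate(train)  # constant-probability baseline
score = log_loss([r["is_installed"] for r in valid], baseline)
```

A time-based split like this avoids scoring a model on impressions that precede its training data, which a random split would allow.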

Prize

Top three participating teams: $2500/$1500/$1000
Special prize for academic teams: $1500

Participation and Data

Registration & Data Access is open now!

Timeline

When?	What?
27 March, 2023	Start RecSys Challenge: Release dataset
11 April, 2023	Submission System Open
13 April, 2023	Leaderboard live
22 June, 2023 (originally 18 June, 2023)	End RecSys Challenge
28 June, 2023 (originally 24 June, 2023)	Final Leaderboard & Winners: EasyChair open for submissions
3 July, 2023 (originally 30 June, 2023)	Code Upload: Upload code of the final predictions
14 July, 2023	Paper Submission Due
1 August, 2023	Paper Acceptance Notifications
14 August, 2023	Camera-Ready Papers
September, 3rd week	RecSys Challenge Workshop @ ACM RecSys 2023

Paper Submission Guidelines

Submission website: EasyChair

All participants of the challenge are invited to submit a paper if they consider their approach particularly effective, novel, or otherwise interesting, or if it exploits identified particularities of the data.
Note: Paper submission is mandatory if you want to be eligible for a prize. Accepted papers are given a presentation slot at the workshop. At least one author of each accepted paper must attend the workshop and present their work. Please note that a badly written paper or absence from the workshop may prevent you from being eligible for the prize. Please contact the workshop organization if none of the authors will be able to attend the workshop.
Page limit: 7 pages + references (ACM SIG Format). Note: This refers to the latest single-column ACM template.
Anonymization of submissions is not required; please include your team name in the abstract and text, as well as a link to your code repository, the achieved score, and a reference to the RecSys Challenge Website (http://www.recsyschallenge.com/2023/). Note: For the camera-ready version, the website reference will be replaced with a reference to an overview paper in the RecSys proceedings.
The submitted papers will be evaluated based on novelty, clarity, and presented empirical results.
Each paper will be reviewed by at least three PC members.
Our proceedings will be published in the ACM Digital Library within its International Conference Proceedings Series.
Accepted papers must be presented at the RecSys Challenge Workshop.

Workshop Program and Accepted Papers

The RecSys Challenge Workshop will take place on September 19th, 2023.
All times are in SGT.

Time	Session	Authors
9:00-9:15	Opening
9:15-9:30	A Simple and Robust Ensemble For Click-Through Rate Prediction	Xingmei Wang and Yankai Wanga
9:30-9:45	Predicting Conversion Rate in Advertising Systems: A Two-Stage Approach with LightGBM	Lulu Wang, Yu Zhang, Huayang Zhao, Zhewei Song and Jiaxin Hu
9:45-10:00	Integrating Explicit and Implicit Feature Interactions for Online Ad Installation Forecasting	Jiawei Jiang, Bing Wang and Jingyuan Wang
10:00-10:15	Capturing Performance and Privacy by Assembling Avengers of Online Advertising	Taehee Kim, Seungyun Baek, Taehyeon Jeon, Hojin Jung, Joonhong Kim and Taeho Lee
10:15-10:30	Lightweight Boosting Models for User Response Prediction Using Adversarial Validation	Hyeonwoo Kim and Wonsung Lee
10:30-11:15	Coffee Break
11:15-11:30	Robust User Engagement Modeling With Transformers and Self Supervision	Yichao Lu and Maksims Volkovs
11:30-11:45	Pessimistic Rescaling and Distribution Shift of Boosting Models for Impression-Aware Online Advertising Recommendation	Paolo Basso, Arturo Benedetti, Nicola Cecere, Alessandro Maranelli, Salvatore Marragony, Samuele Peri, Andrea Riboni, Alessandro Verosimile, Davide Zanutto and Maurizio Ferrari Dacrema
11:45-12:00	Graph Enhanced Feature Engineering for Privacy Preserving Recommendation Systems	Chendi Xue, Xinyao Wang, Yu Zhou, Poovaiah Palangappa, Rita Brugarolas Brufau, Aasavari Dhananjay Kakne, Ravi Motwani, Ke Ding and Jian Zhang
12:00-12:15	A Simple yet Strong Approach for Installation Prediction in ShareChat Ads	Xiaoteng Shen, Liangcai Su, Zhutian Lin and Xiao Xi
12:15-12:30	Large Scale CVR Prediction through Hierarchical History Modeling	Qi Zhang, Zhibin Zhang, Biao Lu, Bangzheng He and Liangbi Li
12:30-12:35	Closing remarks

Organization

Workshop Program Committee

Saikishore Kalloori, ETH Zurich
Olivier Jeunen, ShareChat UK
Manoj Reddy, University of California
Bruce Ferwerda, Jönköping University
Luca Belli, Twitter Cortex
Ludovico Boratto, University of Cagliari
Dietmar Jannach, University of Klagenfurt
Marko Tkalcic, University of Primorska
