HackerEarth Machine Learning Challenge: How NOT to lose a customer in 10 days

A collection of codes submitted for machine learning competitions on various platforms


HackerEarth Machine Learning Challenge: How NOT to lose a customer in 10 days

Mar 16, 2021, 07:30 PM IST - Apr 15, 2021, 07:30 PM IST

245/4293 Ranked Leaderboard

leaderboard

feature image

### Problem

Churn rate is a marketing metric that describes the number of customers who leave a business over a specific time period. . Every user is assigned a prediction value that estimates their state of churn at any given time. This value is based on:

  • User demographic information
  • Browsing behavior
  • Historical purchase data among other information

It factors in our unique and proprietary predictions of how long a user will remain a customer. This score is updated every day for all users who have a minimum of one conversion. The values assigned are between 1 and 5.

Task

Your task is to predict the churn score for a website based on the features provided in the dataset.

Data description

The dataset folder contains the following files:

  • train.csv: 36992 x 25
  • test.csv: 19919 x 24
  • sample_submission.csv: 5 x 2

The columns provided in the dataset are as follows:

Column name Description
customer_id Represents the unique identification number of a customer
Name Represents the name of a customer
age Represents the age of a customer
security_no Represents a unique security number that is used to identify a person
region_category Represents the region that a customer belongs to 
membership_category Represents the category of the membership that a customer is using
joining_date Represents the date when a customer became a member 
joined_through_referral Represents whether a customer joined using any referral code or ID
referral_id Represents a referral ID
preferred_offer_types Represents the type of offer that a customer prefers
medium_of_operation Represents the medium of operation that a customer uses for transactions
internet_option Represents the type of internet service a customer uses
last_visit_time Represents the last time a customer visited the website
days_since_last_login Represents the no. of days since a customer last logged into the website
avg_time_spent Represents the average time spent by a customer on the website
avg_transaction_value Represents the average transaction value of a customer
avg_frequency_login_days Represents the no. of times a customer has logged in to the website
points_in_wallet Represents the points awarded to a customer on each transaction 
used_special_discount Represents whether a customer uses special discounts offered
offer_application_preference Represents whether a customer prefers offers 
past_complaint Represents whether a customer has raised any complaints 
complaint_status Represents whether the complaints raised by a customer was resolved 
feedback Represents the feedback provided by a customer
churn_risk_score Represents the churn risk score that ranges from 1 to 5

Evaluation metric

score = 100 x metrics.f1_score(actual, predicted, average="macro")

Result submission guidelines

  • The index is customer_id and the target is the churn_risk_score column. 
  • The submission file must be submitted in .csv format only.
  • The size of this submission file must be 19919 x 2.

Note: Ensure that your submission file contains the following:

  • Correct index values as per the test file
  • Correct names of columns as provided in the sample_submission.csv file