HackerEarth Machine Learning Challenge: Exhibit A(rt)
Leaderboard rank: 108/3958
An art exhibitor is about to launch an online portal that lets enthusiasts worldwide start collecting art with a click of a button. However, the logistics of selling and distributing art, such as acquiring pieces effectively and shipping them to their destinations after purchase, are not straightforward.
Task
The exhibitor has hired you as a Machine Learning Engineer for this project. You are required to build an advanced model that predicts the cost of shipping paintings, antiques, sculptures, and other collectibles to customers based on the information provided in the dataset.
Dataset Description
The dataset folder contains the following files:
- train.csv: 6500 x 20
- test.csv: 3500 x 19
- sample_submission.csv: 5 x 2
The columns provided in the dataset are as follows:
Column name | Description |
Customer Id | Represents the unique identification number of the customers |
Artist Name | Represents the name of the artist |
Artist Reputation | Represents the reputation of an artist in the market (the greater the reputation value, the higher the reputation of the artist in the market) |
Height | Represents the height of the sculpture |
Width | Represents the width of the sculpture |
Weight | Represents the weight of the sculpture |
Material | Represents the material that the sculpture is made of |
Price Of Sculpture | Represents the price of the sculpture |
Base Shipping Price | Represents the base price for shipping a sculpture |
International | Represents whether the shipping is international |
Express Shipment | Represents whether the shipping was in the express (fast) mode |
Installation Included | Represents whether the order had installation included in the purchase of the sculpture |
Transport | Represents the mode of transport of the order |
Fragile | Represents whether the order is fragile |
Customer Information | Represents details about a customer |
Remote Location | Represents whether the customer resides in a remote location |
Scheduled Date | Represents the date when the order was placed |
Delivery Date | Represents the date of delivery of the order |
Customer Location | Represents the location of the customer |
Cost | Represents the cost of the order |
Evaluation metric
score = 100*max(0, 1-metrics.mean_squared_log_error(actual, predicted))
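The formula above can be reproduced locally to check a model before submitting. The sketch below is a minimal pure-Python version, assuming `metrics` refers to `sklearn.metrics`, whose `mean_squared_log_error` averages the squared differences of `log(1 + y)`. The helper name `challenge_score` is hypothetical and not part of the challenge material.

```python
import math


def mean_squared_log_error(actual, predicted):
    """Mean of (log(1 + actual) - log(1 + predicted))^2 over all samples."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # MSLE is undefined for negative values; scikit-learn also rejects them.
    if any(v < 0 for v in actual) or any(v < 0 for v in predicted):
        raise ValueError("MSLE cannot be used with negative values")
    squared_errors = [
        (math.log1p(a) - math.log1p(p)) ** 2 for a, p in zip(actual, predicted)
    ]
    return sum(squared_errors) / len(squared_errors)


def challenge_score(actual, predicted):
    """Hypothetical helper: 100 * max(0, 1 - MSLE), as in the formula above."""
    return 100 * max(0.0, 1 - mean_squared_log_error(actual, predicted))


if __name__ == "__main__":
    # Made-up values for illustration only, not taken from the dataset.
    actual = [250.0, 1200.0, 75.5]
    predicted = [240.0, 1300.0, 80.0]
    print(challenge_score(actual, predicted))
```

Because the score is clipped at 0, any model whose MSLE is 1 or greater scores 0 regardless of how far off it is. Since the metric rejects negative values, predictions may need to be clipped to be non-negative before scoring.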