Introduction to how EDITED utilises various machine learning models
In EDITED Market we track over 1 billion products worldwide! In order to organise such a large dataset into globally recognised product categories, we leverage machine learning technology to do this automatically based on the data we collect directly from the retailer websites.
So What Is Machine Learning?
Machine learning is a process in which you teach a computer to recognise patterns in data.
There are many types of machine learning models. At EDITED, the main types of machine learning model we use are:
- Supervised Learning
- Semi-Supervised Learning
- Unsupervised Learning
Supervised: Supervised learning is the process of training a model on labelled data so that it can predict the correct label for new, unseen data. Data labelling is the process of identifying certain properties or characteristics of the raw data and attaching them as context, so that the machine learning model can learn from it.
At EDITED we train many supervised models on large amounts of labelled data collected directly from retailer websites. These models predict which vertical, category and subcategory a product should fall into within the platform: for example, whether a product is Apparel or Homeware, whether it is a skirt or a top, or even whether it is a shirt or a blouse.
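The idea of learning categories from labelled examples can be illustrated with a minimal sketch. This is not EDITED's actual model: the product titles, the labels and the choice of a simple naive Bayes word-count classifier are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled examples (product title, category); not EDITED data.
LABELLED = [
    ("pleated midi skirt", "skirt"),
    ("denim mini skirt", "skirt"),
    ("floral wrap skirt", "skirt"),
    ("cotton crew neck top", "top"),
    ("ribbed tank top", "top"),
    ("striped long sleeve top", "top"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(LABELLED)
print(predict("denim wrap skirt", word_counts, label_counts))
```

Because the words "denim", "wrap" and "skirt" only appear in titles labelled as skirts, the sketch assigns the new title to the skirt category. A production model would use far more data and richer inputs than product titles alone.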
Unsupervised: Unsupervised learning uses unlabelled data, meaning the model learns patterns from the raw data without the identifying labels that are used in supervised or semi-supervised learning. The model automatically groups items in the data set by their common characteristics, and these groups can later be analysed to find the meaning behind them.
An example of how this has been used at EDITED is our Markets > Segment grouping (whether a retailer is Value, Mass, Premium or Luxury). Starting with a large database of retailers containing lots of information (such as the number of products and the average price for each retailer), an unsupervised model automatically sorts these retailers into groups. Our in-house data specialists then analyse the groups and interpret them, trying to find the meaning behind each grouping. As the example diagram shows, we found that the groups corresponded to the retailers' market level.
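The grouping step described above can be sketched with a basic clustering routine. The retailer names, the prices, the use of a single feature (average price) and the choice of k-means are all assumptions for illustration; the article does not say which algorithm or features EDITED uses.

```python
# Hypothetical average prices per retailer; not EDITED data.
AVERAGE_PRICES = {
    "retailer_a": 12.0, "retailer_b": 15.0,
    "retailer_c": 45.0, "retailer_d": 52.0,
    "retailer_e": 180.0, "retailer_f": 210.0,
    "retailer_g": 950.0, "retailer_h": 1100.0,
}

def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters with a basic 1-D k-means (k >= 2)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda i: abs(v - centroids[i])) for v in values]
        # Move each centroid to the mean of its members (unchanged if empty).
        for i in range(k):
            members = [v for v, lab in zip(values, labels) if lab == i]
            if members:
                centroids[i] = sum(members) / len(members)
    return labels, centroids

names = list(AVERAGE_PRICES)
labels, centroids = kmeans_1d([AVERAGE_PRICES[n] for n in names], k=4)

groups = {}
for name, label in zip(names, labels):
    groups.setdefault(label, []).append(name)
print(groups)
```

Note that the model only returns numbered groups. Deciding that one group looks like "Value" and another like "Luxury" is the interpretation step carried out by people afterwards.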
Semi-supervised: Semi-supervised learning is a combination of supervised and unsupervised learning. It uses a small amount of labelled data and a large amount of unlabelled data.
For example, at EDITED, collecting enough labelled data to train a fully supervised model would take too long for some models, so we rely on semi-supervised learning instead. This means we collect a small amount of labelled data, feed it to the model and allow it to gather its own training data based on the patterns it discerns from the small labelled set. Semi-supervised learning is used to categorise different styles within a subcategory in EDITED, e.g. providing a few labels to help identify whether a trouser is flared, tuxedo or cargo.
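One common way to let a model gather its own training data is self-training (also called pseudo-labelling), sketched below. The titles, the word-overlap scoring and the confidence threshold are hypothetical, and the article does not state that this is the specific technique EDITED uses.

```python
from collections import defaultdict

# Hypothetical data; not EDITED data.
LABELLED = [
    ("flared high waist trouser", "flared"),
    ("cargo pocket utility trouser", "cargo"),
    ("tuxedo satin stripe trouser", "tuxedo"),
]
UNLABELLED = [
    "flared bootcut trouser",
    "utility cargo trouser",
    "satin stripe formal trouser",
    "bootcut stretch trouser",
    "formal tailored trouser",
]

def fit(examples):
    """Map each label to the set of words seen with it."""
    words_by_label = defaultdict(set)
    for text, label in examples:
        words_by_label[label].update(text.lower().split())
    return words_by_label

def predict_with_margin(text, words_by_label):
    """Return (best label, margin over the runner-up) by word overlap."""
    words = set(text.lower().split())
    scores = sorted(
        ((len(words & seen), label) for label, seen in words_by_label.items()),
        reverse=True,
    )
    (best_score, best_label), (second_score, _) = scores[0], scores[1]
    return best_label, best_score - second_score

def self_train(labelled, unlabelled, min_margin=1):
    """Repeatedly add confidently predicted items to the training set."""
    training, remaining = list(labelled), list(unlabelled)
    while remaining:
        model = fit(training)
        confident, unsure = [], []
        for text in remaining:
            label, margin = predict_with_margin(text, model)
            (confident if margin >= min_margin else unsure).append((text, label))
        if not confident:
            break  # nothing new could be labelled with enough confidence
        training.extend(confident)
        remaining = [text for text, _ in unsure]
    return training, remaining

training, remaining = self_train(LABELLED, UNLABELLED)
for text, label in training[len(LABELLED):]:
    print(f"{text} -> {label}")
```

In this sketch, "bootcut stretch trouser" cannot be labelled in the first pass. Once "flared bootcut trouser" has been pseudo-labelled as flared, the model has learned that "bootcut" goes with flared styles and can label it in the second pass, which is how a small labelled set grows into a larger training set.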
To see more examples of this in practice at EDITED, see the ‘Making sense of the data’ section of our ‘Introduction to EDITED data’ article.
If you have any questions or would like more information, please reach out to support@edited.com.