By Jahnavi Kachina, IDEAS

Wayfair is a fast-growing American e-commerce company that sells furniture and home goods online. As of January 2014, Wayfair was the largest retailer of furniture and home goods in the United States and the 33rd-largest online retailer in the United States [1]. The company has generated revenue, as shown in Figure 1 a), but it has not been able to turn a profit since it went public in 2014, as shown in Figure 1 b).

Figure 1 a) Wayfair Revenue from 2010-2017
Figure 1 b) Wayfair loss from 2014-2019

On 14th February, the company announced that it was laying off a total of 550 employees, including 350 at its Boston headquarters. The company's CEO admitted that the company had invested unevenly across different areas, such as advertising, hiring new employees, and putting money back into the business for growth. This suggests that the company focused on short-term management rather than the long term. In addition, the company increased its headcount almost fourfold after 2016. The second poor investment was advertising: the company spent money on marketing and advertising that was not profitable. The company grew to a remarkable size between 2002 and 2014, although it put most of its revenue back into growing even larger. Even though Wayfair has steady customer demand, the revenue graph in Figure 1 above has not been impressive. As analyst Edward Yruma wrote at the time, “Wayfair has a very powerful consumer offering, but profitability remains even more elusive.” That holds because selling and shipping furniture is more expensive than it is for Amazon’s products [2]. The paragraphs below describe Wayfair's data science advancements.

Jupyter-friendly data science tools at Wayfair:

Wayfair's data science tools have two major advantages: 1) they allow new users to get on board with an existing project from day one, and 2) they let users implement a new project by self-serving a small handful of functions. Wayfair has developed its data science software in three sections.

Section 1: RoSE (Room and Style Estimator)

To understand customers' needs, Wayfair uses RoSE, a computer vision model, to identify a customer's taste based on images. RoSE is a VGG network trained on over 800k room images. Customers select one style from a set of given images, and RoSE can then make decisions based on the image features [3].

Figure 2 The Computer Vision ETL pipeline. [3]

The pipeline breaks down into four steps:

  1. Get image resource locations and other metadata: Collect the data with a query.
  2. Pre-process images and copy locally: Apply pre-processing functions to each image (e.g. scaling, padding, perturbing) and save it for training.
  3. Build training instances and labels: For RoSE, there are two approaches.
    • Predict which style won the majority of the experts’ votes.
    • Identify the top two images from them.
  4. Perform inference/training: Use Keras generators that create image batches in a background process faster than they are consumed.
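
Steps 3 and 4 above can be illustrated with a minimal sketch. It is not Wayfair's code: it uses only the Python standard library in place of Keras generators, and every name in it (`majority_label`, `batches`, `prefetch`, the sample votes) is hypothetical. It shows how a majority-vote label could be built and how batches might be prepared in a background thread ahead of the consumer.

```python
import queue
import threading
from collections import Counter


def majority_label(votes):
    """Return the style with the most expert votes (ties go to the first seen)."""
    return Counter(votes).most_common(1)[0][0]


def batches(items, batch_size):
    """Yield consecutive batches; the last one may be shorter."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def prefetch(generator, max_ahead=4):
    """Run `generator` in a background thread, buffering up to `max_ahead` items."""
    buffer = queue.Queue(maxsize=max_ahead)
    done = object()  # sentinel marking the end of the stream

    def worker():
        for item in generator:
            buffer.put(item)
        buffer.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


# Hypothetical expert votes per image (image id -> list of style votes).
expert_votes = {
    "img_1": ["modern", "modern", "rustic"],
    "img_2": ["coastal", "rustic", "coastal"],
    "img_3": ["rustic", "rustic", "modern"],
}

# Step 3: one (image id, label) training instance per image.
instances = [(image_id, majority_label(v)) for image_id, v in expert_votes.items()]

# Step 4: batches are produced in the background while the consumer works.
prefetched = list(prefetch(batches(instances, batch_size=2)))
```

The bounded queue is the design point: the producer can run ahead of training, but only by `max_ahead` batches, so memory use stays limited.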

Section 2: MVC (Model-View-Controller) Framework

The three basic components of MVC are the Controller, which takes input from the user; the Model, which understands the steps required to act on it; and the View, which replies to the user through the application. The framework is built to give customers a satisfactory experience and confidence that their product will be delivered safely [3].

Figure 3 MVC framework that isolates the data science part of data science. [3]

Inside the MVC framework, there are three major components:

  1. Controller: The user’s instruction to the application about what sort of action is required. For example, the checkout button.
  2. Model: The guidelines for how to take those actions. For example, backend integrations with order processing.
  3. View: The response from the application to the user, such as loading the order confirmation page and sending a confirmation email.
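
The checkout example above can be sketched in a few lines of Python. This is an illustrative toy, not Wayfair's implementation; the class and method names (`OrderModel`, `ConfirmationView`, `CheckoutController`, `checkout`) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OrderModel:
    """Model: knows how to carry out an action (here, recording an order)."""
    orders: list = field(default_factory=list)

    def place_order(self, item):
        order_id = len(self.orders) + 1
        self.orders.append((order_id, item))
        return order_id


class ConfirmationView:
    """View: turns the result into a response for the user."""

    def render(self, order_id, item):
        return f"Order {order_id} confirmed: {item}"


class CheckoutController:
    """Controller: receives the user's instruction and coordinates model and view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def checkout(self, item):
        order_id = self.model.place_order(item)
        return self.view.render(order_id, item)


# Pressing a hypothetical "checkout" button for one item.
controller = CheckoutController(OrderModel(), ConfirmationView())
message = controller.checkout("sofa")
```

The user only ever talks to the controller, so the model's order-processing logic can change without affecting the call that the user makes.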

Section 3: Hoover—Image Ingestion Pipeline Client

In Hoover, the controllers expose operations called “controller actions”, which act on the user's instructions and save each change to the database. Data scientists import the controllers into their notebook and perform the actions in sequence [3].

Figure 4 The ETL pipeline steps and its mapping onto controller actions. [3]

The tool creation workflow can be described as follows:

  1. Divide the whole process into subsections.
  2. Separate out the subsections where data science effort is required.
  3. Map those subsections onto the MVC framework.

The functions used in each specific project live alongside the model code in the project repositories. This keeps the documentation and code for each individual project properly encapsulated.
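
As a rough illustration of mapping pipeline steps onto controller actions, the sketch below defines a hypothetical `PipelineController` whose methods a data scientist could call in sequence from a notebook. It is not Hoover's actual API: the method names, the in-memory `store` standing in for a database, and the fake image paths are all assumptions.

```python
class PipelineController:
    """Hypothetical controller; each public method is one 'controller action'."""

    def __init__(self):
        self.store = {}  # stand-in for the database that records each step
        self.log = []    # order in which the actions ran

    def _save(self, step, result):
        self.store[step] = result
        self.log.append(step)
        return result

    def get_metadata(self, image_ids):
        # Step 1: collect resource locations (fake paths, for illustration only).
        return self._save("metadata", {i: f"images/{i}.jpg" for i in image_ids})

    def preprocess(self, size=224):
        # Step 2: record the target size for each image instead of real resizing.
        meta = self.store["metadata"]
        return self._save("preprocessed", {i: (p, size) for i, p in meta.items()})

    def build_instances(self, labels):
        # Step 3: pair each pre-processed image with its label.
        pre = self.store["preprocessed"]
        return self._save("instances", [(pre[i], labels[i]) for i in pre])


# In a notebook, the actions are simply called one after another.
controller = PipelineController()
controller.get_metadata(["a", "b"])
controller.preprocess(size=128)
instances = controller.build_instances({"a": "modern", "b": "rustic"})
```

Only the labelling in the last action is data science work here; the earlier actions are plumbing that a shared tool can own.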

In conclusion, Wayfair uses the model-view-controller (MVC) architecture to build lightweight, Jupyter-friendly tools around data pipeline abstractions, because data scientists otherwise spend a great deal of time preparing images that meet users’ requirements. Hoover empowers data scientists and helps newcomers get on board with existing projects. Additionally, the new ETL pipelines allow data scientists to focus on data science tasks instead of software engineering. Hence, Wayfair's data science technology is up to date and effective, but its investment decisions were not sound.


  1. Wikipedia, “,” [Online]. Available:
  2. B. News, “,” 13 02 2020. [Online]. Available: business/2020/02/13/wayfair-layoffs.
  3. Wayfair, “,” 11 02 2020. [Online]. Available:

