05.10.2020 Rachhek Shrestha
Proposal Driller
Search engine for project proposal documents
BERLIN • HELSINKI • LONDON • MUNICH • OSLO • STOCKHOLM • STUTTGART • TAMPERE
INDEX
1. Motivation behind Proposal Driller
2. Screenshots
3. An example of a proposal document
4. Proposal Driller’s architecture diagram
5. Tools used to build and evaluate Proposal Driller
6. Challenges
Motivation behind Proposal Driller
Motivation
- Futurice wanted their salespeople to become more customer-centric by utilizing their organization’s data with the help of machine learning and data mining.
What does it mean to be customer-centric?
- Understanding the clients by putting them at the core of what you do.
- Pursuing customer happiness and treating the customer’s success as your own success.
Why do salespeople want to be customer-centric?
- For making a good impression
- For building trust
- For gaining and growing the business
- For being a strategic long-term partner
How do people in sales try to understand their client?
- They search for as much information as they can related to their client.
- They want to search for past projects of the customer, similar customer challenges, technology, problems, solutions, colleagues, etc.
- For example, project proposal documents are an important source of such knowledge.
What are the pain points for searching proposal documents?
- Project proposal documents are stored in a shared Google Drive and are scattered.
- Searching for proposal documents in Google Drive has been a pain point for salespeople.
- Google Drive search is not user-friendly; it is a time-consuming, chaotic and frustrating experience.
Screenshots
(Three screenshot slides; the images are not reproduced in this transcript.)
Proposal document example
Sections of a typical proposal:
1. Title page
2. Table of contents
3. Company introduction
4. Problem definition
5. Solution definition
6. Team members
7. Project plan and pricing
8. Reference projects from the past
9. Ending slide
Architecture
Tools used
Document acquisition
- Collected 1300 documents (PDF + Google Slides) from Google Drive using the Drive API and TypeScript.
- Only files with the word ‘proposal’ or ‘offer’ in the file name were collected.
- The fields collected were as follows:
  - File name
  - File type
  - Created time
  - Total number of pages
  - Text separated by pages
  - Last modifying user
- The documents were stored in Elasticsearch.
Google Drive API, TypeScript
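The file-name filter and the collected fields can be sketched as follows. This is a minimal illustration in Python rather than the TypeScript used in the project; the names `is_proposal_file`, `select_proposals` and `ProposalDocument` are hypothetical, and case-insensitive matching is an assumption, since the slides do not say how the match was done.

```python
from dataclasses import dataclass
from typing import List

# Keywords named on the slide; matching case-insensitively is an assumption.
KEYWORDS = ("proposal", "offer")


@dataclass
class ProposalDocument:
    """Hypothetical record mirroring the fields listed on the slide."""
    file_name: str
    file_type: str
    created_time: str
    total_pages: int
    pages: List[str]          # text separated by pages
    last_modifying_user: str


def is_proposal_file(file_name: str) -> bool:
    """Return True if the file name contains 'proposal' or 'offer'."""
    lowered = file_name.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def select_proposals(file_names: List[str]) -> List[str]:
    """Keep only the file names that pass the keyword filter, in order."""
    return [name for name in file_names if is_proposal_file(name)]
```

A plain substring check like this also matches names such as "offering", so a real pipeline might need a stricter rule.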
Document enrichment through information extraction
- Problem & solution definition, customer name (using a simple heuristic)
- Emails (using regex pattern matching)
- Employee names and job titles (using the spaCy library’s Named Entity Recognition, trained on employee names/job titles from the company database)
- Document summary and keywords (using Gensim’s summarizer function)
- Topics (using topic modeling in R, made by a colleague)
- Top 5 similar documents list (using Gensim’s Doc2Vec library)
- Vector embedding of the document (using the Python implementation of TensorFlow’s Universal Sentence Encoder)
- In addition, data from various company databases was also merged in using Python.
Python
spaCy: library for NLP
Gensim: library for topic modeling and NLP
TensorFlow: used for converting documents to embedding vectors
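As an illustration of the regex-based email extraction step, here is a minimal sketch. The pattern and the function name `extract_emails` are assumptions for illustration; the slides do not show the pattern that was actually used, and this one is deliberately simple rather than fully standards-compliant.

```python
import re
from typing import List

# A simple, permissive email pattern (assumption, not the project's actual regex).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(pages: List[str]) -> List[str]:
    """Return unique, lower-cased email addresses in order of first appearance."""
    found: List[str] = []
    for page_text in pages:
        for match in EMAIL_PATTERN.findall(page_text):
            email = match.lower()
            if email not in found:
                found.append(email)
    return found
```

The input is a list of page texts, matching the "text separated by pages" field collected during document acquisition.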
Search engine platform
- Elasticsearch is an open source search engine
- Provides a lot of features out of the box
  - Inverted index for storing documents
  - Indexing API for storing documents into the inverted index
  - Search API for searching the documents that are stored in the inverted index
  - Query Domain Specific Language (DSL) for building complicated queries to match and rank the documents
  - Feature to search by keyword and search by embedding vectors
Elasticsearch hosted in Amazon Web Services
Elasticsearch query examples
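The query examples shown on this slide are not preserved in the transcript. As a rough sketch of what keyword and embedding-vector queries can look like in the Query DSL, the Python functions below build the request bodies as plain dictionaries. The field names (`file_name`, `page_text`, `embedding`) and function names are hypothetical, and the `script_score` query with `cosineSimilarity` follows the form documented for Elasticsearch 7.x `dense_vector` fields; the exact syntax depends on the Elasticsearch version and hosting, so treat it as an assumption rather than the project's actual queries.

```python
from typing import Any, Dict, List


def build_keyword_query(text: str, size: int = 10) -> Dict[str, Any]:
    """Keyword search over hypothetical text fields, boosting file-name matches."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["file_name^2", "page_text"],  # hypothetical field names
            }
        },
    }


def build_vector_query(query_vector: List[float], size: int = 10) -> Dict[str, Any]:
    """Rank all documents by cosine similarity to the query embedding."""
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # +1.0 keeps the score non-negative, as Elasticsearch requires.
                    "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }
```

Either dictionary would be sent as the JSON body of a search request; the vector variant assumes the query text has already been converted to an embedding of the same dimension as the stored document vectors.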
Backend
- Used as an intermediary between the frontend application and the Elasticsearch API
- Handled user interaction logging
- Handled query processing
- Used TensorFlow.js to convert the queries to embedding vectors
TensorFlow.js: JavaScript version of TensorFlow that runs in the browser/web
Node.js: backend server
Search engine user interface
- Made using Elastic’s search-ui, a React JS framework for creating search user interfaces easily.
- Provides a lot of rich features like filters, pagination, highlighting etc. out of the box
- Works great with the Elasticsearch search platform
- Easily customizable
- Front end features for Proposal Driller
  - Home page showing recent searches, opened documents etc.
  - Search box with query autocomplete
  - Filters
  - Sorting
  - Pagination
  - ‘Did you mean’ queries
  - Search results pane with expandable icons
  - Relevance judgment collector
  - User interaction logging
  - Query version selector
Evaluation of search results
- Evaluation of usefulness of functionality
  - Done by collecting user ratings through a questionnaire
- Evaluation of extracted information
  - Done manually for a small subset of documents
- Evaluation of relevance of search results
  - Done by gathering relevance judgments for documents over a representative set of search queries
  - Used Precision @ K and Discounted Cumulative Gain (DCG)
  - Elasticsearch has evaluation APIs that make it easier to calculate these scores
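To make the two relevance metrics concrete, here is a minimal sketch that computes them from a ranked list of relevance judgments for a single query. It is not the Elasticsearch evaluation API; the function names are hypothetical, and the DCG shown uses the linear-gain form, the sum of rel_i / log2(i + 1) over ranks i, whereas some tools use an exponential gain instead.

```python
import math
from typing import Sequence


def precision_at_k(relevances: Sequence[int], k: int) -> float:
    """Fraction of the top-k results judged relevant (rating > 0).

    Missing results (fewer than k returned) count as non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for rating in relevances[:k] if rating > 0) / k


def dcg_at_k(relevances: Sequence[int], k: int) -> float:
    """Discounted Cumulative Gain with linear gain: sum(rel_i / log2(i + 1))."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(
        rating / math.log2(rank + 1)
        for rank, rating in enumerate(relevances[:k], start=1)
    )
```

Here `relevances` holds the graded judgment for each returned document in ranked order, for example `[3, 0, 2, 1]` when the first result was rated 3, the second 0, and so on.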
Challenges
- Non-technical challenges
  - Talking with salespeople and understanding the core of their problem
  - Understanding a meaning of relevancy that fits everyone
- Technical challenges
  - Real-world data is messy
    - Not having a fixed format of proposal documents
    - Irrelevant information inside the proposals
  - Not having a pre-defined evaluation dataset
    - Made it hard to evaluate the application
Thank you! Kiitos! Danke! Tack!
- Rachhek Shrestha
- Data Scientist
https://www.linkedin.com/in/rachhekshrestha/