faster document review and production
TRANSCRIPT
Stu Van Dusen, Lexbe
March 16th, 2016
A Lawyer's Guide to Faster Document Review & Production
Best Practices for Leveraging Attorney & Staff time with Computer-Assisted Search, Document Clustering and Predictive Coding
○ Webinars take place monthly and cover a variety of relevant eDiscovery topics.
○ If you have technical issues or questions, please email [email protected].
What attendees are saying:
○ "Excellent presentation! One of the best webinars I have attended!"
○ “Time well spent.”
○ "Great in terms of content and presentation. Thanks!"
○ "Excellent and informative piece!"
Info & Future
A Lawyer’s Guide to Faster Document Review & Production | March 16, 2016
eDiscovery Webinar Series
○ eDiscovery Solutions Consultant of Lexbe LC, a provider of cloud-based litigation processing, review and document management software & eDiscovery services
○ Specializes in working with firms that lack a full in-house eDiscovery department but are involved in complex litigation requiring a high level of precision and eDiscovery expertise to gain the advantage in the discovery phase.
Stu Van Dusen
800-401-7809 x55
Stu Van Dusen Bio
● Increasingly Document-Intensive Cases & Linear Reviews
● What are Technology Enhanced Reviews?
● When Should Technology Enhanced Reviews be Considered?
● Modern High-Speed Keyword Search
● Grouping Similar Documents for Grouped Review
● Uses and Applications of Predictive Coding
● Summary
Agenda
Faster Document Review & Production
Cases Continue to Grow in Size
[Chart: growth of digital data in zettabytes* (y-axis: 1, 3, 5) by year (x-axis: 2005, 2010, 2015, 2020). Source: IDC Digital Universe Study.]
* 1 Zettabyte = 1 Trillion Gigabytes
Data sources labeled on the chart: VoIP, Email, iPhones, Peer-to-Peer, Online Storage, Digital Cameras, Facebook, LinkedIn, DropBox, Backup Devices, Elastic Storage, SaaS, Google Streets, Personal Blogs, Skype, World Satellite Images, Personal Scanners, Customer Service Recordings, Public Webcams, Google Drive, Netbooks, Cloud Instance Servers, PaaS
The Challenge of Document-Intensive Cases
Decreasing Document Volume
Increasing Document Relevance
Linear Review + Increasing ESI Volumes = High Costs
N. Pace and L. Zakaras, “Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery” (RAND Institute for Civil Justice 2012)
CASE STAGE
Collection          8%
Processing         19%
Review             73%
Total             100%

SOURCE
Internal            4%
eDisc Providers    26%
Outside Counsel    70%
Total             100%
Best opportunities for further cost savings will be technologies and process improvements that increase attorney review efficiencies.
What is a Technology Enhanced Review?
● Technology enhanced reviews are those in which additional applications, algorithms, or indexes are applied to a document set in order to support the logical grouping of documents or automatic coding of documents based on some degree of human input.
● Litigators should consider applying these technologies to their review workflows and methodologies when some resource (time, money, or people) is critically constrained on a case.
Modern Keyword Search
● Early Stage Culling - Reduce amount of ESI to be reviewed by using keywords to cull document collections.
● Keyword-Based Responsive & Privilege Review - Construct search queries to return documents that are likely to be responsive, privileged, or confidential. Search by name and email address of counsel, and by privilege, work-product, confidential, and related keywords.
● ID Documents for Depo Prep - Find and assign key documents related to specific case participants to prepare for depositions. Search by email addresses used, names and nicknames used, important issues associated with deponent.
● ID of Key Docs for Trial - Find and mark key case documents. Code documents that will be needed for trial.
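The culling and privilege-screening uses above can be pictured with a minimal sketch. It assumes a tiny in-memory collection and a hypothetical term list (the document IDs, texts, and the `cull` helper are illustrative, not part of any review platform); a production search engine would use an index rather than scanning each document.

```python
import re

def build_query(terms):
    """Compile a case-insensitive, whole-word pattern matching any term."""
    escaped = (re.escape(t) for t in terms)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

def cull(documents, terms):
    """Return the IDs of documents whose text hits at least one term."""
    pattern = build_query(terms)
    return [doc_id for doc_id, text in documents.items() if pattern.search(text)]

# Hypothetical mini-collection and privilege terms, for illustration only.
documents = {
    "DOC-001": "Per advice of counsel, this memo is privileged and confidential.",
    "DOC-002": "Lunch order for Friday: two pizzas.",
    "DOC-003": "Draft prepared as attorney work-product for the Smith matter.",
}
privilege_terms = ["privileged", "work-product", "confidential"]

print(cull(documents, privilege_terms))  # ['DOC-001', 'DOC-003']
```

Whole-word matching is a design choice here: it avoids hits on longer words that merely contain a term, at the cost of missing plurals and variants unless they are listed explicitly.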
Modern Keyword Search Benefits
● Fast - Keyword search is very fast compared with other document search methodologies.
● Inexpensive - Good results can be obtained at little cost compared with manual review or other computer assisted methodologies.
● Quality - Search can deliver high quality results, particularly if keyword terms are carefully developed and tested.
● Avoids Manual Review Errors/Inconsistencies - Search results are computer generated, and so avoid known human review errors that can result from fatigue, inadequate training, lack of focus, etc.
Multi-Index Based Keyword Search
Benefits of Multi-Index Approach
● Keyword search is supported best by indexes created from text extracted from Native files (email, attachments, spreadsheets, etc.) and a paginated file converted from Native files into PDF or TIFF and OCRed.
● Most comprehensive approach and minimizes potential of lost data.
Index Method        Captures Embedded Text   Captures Text Excluded From Print   Captures Hidden Text
Imaged/OCR          Yes                      No                                  No
Native Extraction   No                       Yes                                 Yes
Lexbe Multi-Index   Yes                      Yes                                 Yes
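A rough sketch of why searching more than one index helps, assuming each extraction route yields its own text for the same document. The two dictionaries and the `multi_index_search` helper are hypothetical and only illustrate the idea of taking the union of hits; they do not describe how any particular product builds its indexes.

```python
from collections import defaultdict

def build_index(texts):
    """Map each lowercase token to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def multi_index_search(term, *indexes):
    """Union the hits for a term across every index supplied."""
    hits = set()
    for index in indexes:
        hits |= index.get(term.lower(), set())
    return hits

# Hypothetical text for one slide deck, as seen by each extraction route.
ocr_text = {"DECK-1": "quarterly forecast chart"}          # text visible on the imaged page
native_text = {"DECK-1": "speaker notes mention layoffs"}  # text that never prints

ocr_index = build_index(ocr_text)
native_index = build_index(native_text)

print(multi_index_search("layoffs", ocr_index))                # set()
print(multi_index_search("layoffs", ocr_index, native_index))  # {'DECK-1'}
```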
Multi-Index Based Keyword Search
● Native extraction will not index embedded body content
Multi-Index Based Keyword Search
● Image/OCR will not index embedded Speaker’s Notes
Multi-Index Based Keyword Search
● Multi-Index Approach Captures Everything
Near Duplicate Detection
● NearDup technology automatically recognizes similar documents within an e-discovery document collection
● Algorithm analyzes, evaluates and compares the actual text content of the documents to each other
[Figure: Unstructured Documents sorted into NearDup Groupings]
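One common way to compare the text of documents to each other is word shingling with Jaccard similarity. The sketch below uses that technique as a stand-in; the slides do not say which algorithm NearDup uses, and the 0.5 threshold, the greedy grouping strategy, and the sample texts are all assumptions made for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: join the first group whose lead document is similar enough."""
    groups = []
    for doc_id, text in docs.items():
        for group in groups:
            if jaccard(docs[group[0]], text) >= threshold:
                group.append(doc_id)
                break
        else:
            groups.append([doc_id])  # no similar group found, start a new one
    return groups

# Hypothetical documents, for illustration only.
docs = {
    "EMAIL-1": "please review the attached draft settlement agreement before friday",
    "EMAIL-2": "please review the attached draft settlement agreement before monday",
    "REPORT-1": "the quarterly sales report is ready for distribution",
}

print(jaccard(docs["EMAIL-1"], docs["EMAIL-2"]))  # 0.75
print(group_near_duplicates(docs))  # [['EMAIL-1', 'EMAIL-2'], ['REPORT-1']]
```

Comparing every new document only against each group's lead document keeps the sketch short; large collections typically need hashing techniques to avoid pairwise comparison.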
Near Duplicate Detection
There are 4 main applications of NearDup analysis:
1) Grouping similar documents:
● Bunch highly similar documents together for more efficient coding and review
2) Finding hidden ‘key’ or ‘hot’ docs:
● Retrieve and mark unseen documents that have content highly related to existing ‘hot’ or ‘key’ documents
3) Preventing the inadvertent release of privileged information:
● Be automatically alerted to files containing similar content to documents that have already been coded as privileged
4) Enabling email threading:
● Maintain relationships between email conversations
NearDup Groupings - Faster Responsive Review
Benefits
Accelerate document review by batch coding (using multidoc edit) larger groups
Increase coding consistency of batched documents
Reduce privilege errors
NearDup Groupings - Email Threading
Benefits
View email chains with similar text in date & time order
Avoid confusion of emails only tangentially related (<50% text overlap)
Consistently code email chains for responsiveness, privilege, attorney-eyes only, etc.
NearDup Groupings - Preventing Privilege Waiver
Benefits
Reduce privilege errors
Avoid sole reliance on human coding consistency
Establish safeguards to help maintain privilege
○ Predictive coding allows a skilled reviewer to train a computer algorithm to identify responsive and non-responsive documents in a litigation document collection.
○ As an alternative to manual linear review, predictive coding can drastically reduce the amount of time needed to review increasingly large ESI volumes.
What is TAR/Predictive Coding?
CASE STAGE
Collection          8%
Processing         19%
Review             73%
Total             100%
○ Best opportunities for further cost savings will be reducing review costs.
○ Technologies and process improvements, like TAR, reduce costs by increasing attorney review efficiencies
Why Use TAR/Predictive Coding?
Increase Review Speed: TAR is designed to complete review of large ESI collections faster than human reviewers. Applying TAR in a scalable environment maximizes the speed advantage of predictive coding.
Decrease Review Costs: Whether paying per document or per hour, TAR is significantly less expensive than exhaustive manual review.
Increase Review Quality: Many studies conclude that the presumed quality advantage of ‘gold-standard’ manual review is not accurate. TAR can support defensible, high-quality review outcomes.
Why Use TAR/Predictive Coding?
○ A randomized sample of ~ 2,400 documents, a seed set, is selected from the collection.
○ A skilled document review professional reviews and codes the seed set.
○ The coding decisions made in reviewing the seed set train the predictive coding algorithm to identify responsive content in the remaining documents.
How Does TAR/Predictive Coding Work?
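The slides do not say how the figure of roughly 2,400 seed documents was chosen. One plausible reading, offered here only as an assumption, is the standard sample-size formula for estimating a proportion, n = z² · p(1 − p) / e², which gives 2,401 at 95% confidence (z = 1.96), a ±2% margin of error, and the most conservative prevalence estimate p = 0.5. The sketch uses exact fractions so rounding does not shift the result.

```python
import math
from fractions import Fraction

def sample_size(z="1.96", margin="0.02", p="0.5"):
    """Sample size for estimating a proportion: n = z^2 * p * (1 - p) / margin^2.

    Inputs are decimal strings so the arithmetic is exact; the result is
    rounded up to the next whole document.
    """
    z, margin, p = Fraction(z), Fraction(margin), Fraction(p)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(sample_size())               # 2401
print(sample_size(margin="0.05"))  # 385
```

The formula ignores the finite-population correction, so it slightly overstates the sample needed for small collections.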
○ Iterative samples of 25 computer-reviewed documents, control sets, are inspected for coding algorithm accuracy.
○ The responsiveness designation assigned to the document by the computer is either confirmed or overturned.
○ An F-score - derived from precision and recall measures - indicates the stability of the TAR results.
How Does TAR/Predictive Coding Work?
○ The TAR algorithm reviews the document collection based on how it was trained during seed set coding and control set review.
○ Remaining Documents are tagged as responsive/non-responsive.
○ The speed at which the document collection is reviewed by the TAR algorithm is largely based on the computing resources applied to the task.
How Does TAR/Predictive Coding Work?
TAR/Predictive Coding results (F-scores) indicate:
○ What proportion of the responsive documents were found by the algorithm within a particular margin of error (recall)
○ What percentage of documents marked responsive are actually responsive within a particular margin of error (precision)
Understanding TAR/Predictive Coding Results
Precision: A measure of how often the algorithm accurately predicts a document to be responsive; the percentage of produced documents that are actually responsive.
Recall: A measure of what percentage of the responsive documents in a data set have been classified correctly by the algorithm.
F-Score: Harmonic mean of precision and recall.
**Note: F1 scores should not be interpreted as a measure of review quality but rather as an indication of 1) how well the case lends itself to TAR and 2) the quality of the seed set training.
Understanding Results: Precision & Recall
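The three definitions above reduce to simple arithmetic over a validated sample. The sketch below assumes hypothetical counts of true positives (correctly marked responsive), false positives (marked responsive but not), and false negatives (responsive but missed); the function name and the example numbers are illustrative.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 (their harmonic mean) from raw counts."""
    predicted = true_pos + false_pos   # everything the algorithm marked responsive
    actual = true_pos + false_neg      # everything that truly is responsive
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: 80 correct hits, 20 false alarms, 120 responsive documents missed.
p, r, f = precision_recall_f1(80, 20, 120)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
# precision=0.80 recall=0.40 F1=0.53
```

These example counts correspond to a low-recall, high-precision outcome: most of what is produced is responsive, but many responsive documents are left behind.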
High Recall, High Precision: All of the responsive documents in the collection were appropriately coded by the algorithm (high recall). All of the documents produced are actually responsive (high precision). Best possible outcome.
Understanding Results: Precision & Recall
Low Recall, High Precision: Many of the responsive documents in the collection were not appropriately coded by the algorithm (low recall). However, a high percentage of the documents produced are responsive (high precision). Increased risk of under-producing.
Understanding Results: Precision & Recall
High Recall, Low Precision: All of the responsive documents in the collection have been appropriately tagged by the algorithm (high recall). However, many non-responsive documents were incorrectly marked responsive (low precision).
Understanding Results: Precision & Recall
From the Sedona Conference Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery:
“[T]here appears to be a myth that manual review by humans of large amounts of information is as accurate and complete as possible … Even assuming that the profession had the time and resources to continue to conduct manual review of massive sets of electronic data sets (which it does not), the relative efficacy of that approach versus utilizing newly developed automated methods of review remains very much open to debate.” (2007)
From the TREC (Text Retrieval Conference) Legal Track:
“Overall, the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort...Future work may address which technology-assisted review process(es) will improve most on manual review, not whether technology assisted review can improve on manual review.” (2009)
Comparing Outcomes: TAR v. Manual Review
Defensibility: Without understanding how a particular TAR/predictive coding methodology works, it becomes difficult to explain why the algorithm made certain coding decisions.
TAR is No Panacea: TAR is not meant to be used in any and all review situations. Without understanding how a particular TAR/predictive coding methodology works, it is impossible to determine if it is appropriate for your case.
The Importance of Transparency
○ In TAR, Bayesian probability models the likelihood of something being true about a document (e.g., that it is responsive) based on the millions of data connections created while training on the seed set.
○ A Naive Bayesian Classifier, used in Assisted Review+, is a probability model that treats the features of a document as independent variables, an assumption that allows for pattern recognition across many features at once.
The Importance of Transparency: Assisted Review+
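To make the idea of a Naive Bayesian classifier concrete, here is a minimal textbook-style sketch: a multinomial model over words with add-one smoothing. It is not the Assisted Review+ implementation, whose internals the slides do not describe; the seed-set texts, labels, and function names are hypothetical.

```python
import math
from collections import Counter

def train(seed_set):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in seed_set:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability for the text."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # prior for the label
        for word in text.lower().split():
            # Each word contributes independently (the "naive" assumption);
            # add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical coded seed set, for illustration only.
seed_set = [
    ("merger pricing agreement draft", "responsive"),
    ("pricing terms for the merger", "responsive"),
    ("office holiday party friday", "non-responsive"),
    ("parking garage closed friday", "non-responsive"),
]
word_counts, doc_counts = train(seed_set)

print(classify("revised merger pricing", word_counts, doc_counts))  # responsive
print(classify("holiday party parking", word_counts, doc_counts))   # non-responsive
```

Because every score is a sum of per-word terms, a reviewer can inspect which words pushed a document toward a label, which is one reason simple probabilistic models are easier to explain than opaque ones.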
[Diagram labels: Incoming TAR Project, Reviewed Documents]
○ Applying more server resources to a TAR/predictive coding task will increase throughput.
○ TAR offers a substantially faster workflow compared to manual review. Leveraging scalable architectures maximizes the value of this benefit.
The Importance of Scalability
○ TAR/Predictive Coding allows a skilled reviewer to train a computer algorithm to identify responsive and non-responsive documents.
○ You can use TAR/Predictive Coding to increase review speed, decrease review costs, and improve the quality of review results.
○ TAR works by training the algorithm on a seed set, testing it against control sets, and applying the improved algorithm to the remainder of the collection.
○ Predictive coding performance results are communicated in the form of precision and recall scores.
○ It is important to know the underlying logic of the TAR algorithm to interpret, explain, and defend your results.
○ Scalable, transparent predictive coding workflows maximize the intended benefits of technology assisted review.
Summary
Review
We’ll be making the following available to webinar attendees:
● A recorded streaming version
● MP3 podcast
● Webinar slide-deck
Please let us know if you have any questions or comments about this webinar or suggestions for future topics. This webinar is part of the Lexbe eDiscovery Webinar Series. For notices of future live and on-demand webinars in this series, please email us at [email protected] or follow us on LinkedIn.
Please contact us with any questions:
Thank You For Attending
Thank You
Speaker: Stu Van Dusen, 800-401-7809 x55, [email protected]
Moderator: Gene [email protected]
Lexbe Sales [email protected]
(800) 401-7809 x22
“Cost-effective eDiscovery”
“A powerful litigation document management service”
“Because of the Lexbe software, the entire playing field has been leveled for my firm.”
“Lexbe cost advantages, SaaS convenience and search capabilities appeal to many small firms”
“Lexbe is the easiest eDiscovery software I have ever used”
“Secure, easy-to-use and a great review tool for consideration”
Lexbe eDiscovery Platform
Ask Us More About
● The Lexbe eDiscovery Platform, our cloud-based processing, review and production tool. Attorney/staff DIY, no user fees or case fees.
● Our high-speed/high-capacity eDiscovery services, and expert professional services.
● Consultations, price quotes, demos and free trials available.