Five Ideas To Double The Performance Of Your eDiscovery Environment

How To Do It For Little-To-No Additional Cost

As an eDiscovery consultant, I’ve managed and optimized hundreds of client environments. I’ve been fortunate to work with some truly talented technologists. After all these engagements, here’s what I’ve learned. The pressure to perform in this industry is incredibly high. I see organizations struggling with two core questions. First, how do we do more with less, given that budgets and resources are limited? Second, how do we handle a large new project that could overwhelm our existing capacities?

Both questions are about eDiscovery performance. Here’s the good news, as I see it: in virtually every environment I’ve assessed, there have been opportunities to dramatically increase performance for nominal additional cost. In this article I’d like to share five ideas that could really help you do more with less. In fact, it’s entirely possible that these ideas could double the performance of your eDiscovery environment for zero additional dollars.

Who Is This Counsel For?

Law firms, eDiscovery service providers, major accounting and consulting firms, government regulators and even corporations often own and manage eDiscovery environments in-house. While there are substantial differences in their service offerings and business models, all of these types of organizations want speed, throughput and reliability from their environments. If what I’m about to describe sounds like your organization, these ideas could be exactly what you’re looking for:

  • Overall, you believe your eDiscovery environment should be able to perform much better than what you’re seeing today.
  • You’ve made a large capital investment in systems but have not yet realized the performance improvements you were hoping to see.
  • Your technology’s unpredictability causes you to struggle to accurately forecast the time required to complete and deliver against production deadlines.
  • Your organization historically struggles to hit hard deadlines and SLA thresholds.
  • You operate eDiscovery in-house and manage a high volume of client and in-house data.
  • You’ve experienced a system failure (outage) with detrimental financial, reputational, and human consequences.
  • You’re concerned about your capacity to handle a large new case.

Before we look at my five ideas, let me quickly describe what the word performance means to me:

  • Speed. This is about systems running at optimal speed throughout the eDiscovery lifecycle, but particularly during Processing, Culling, Analytics, Review and Production.
  • Power. This is about the environment hitting periods of high-intensity workloads without being overwhelmed, slowing down, crashing and causing platform outages.
  • Scaling Capacity. This is about all systems, not just storage, easily handling peaks and valleys of workloads without being pushed to red-line status.
  • Reliability. This is about your confidence in the systems working day in and day out in a way that is consistent with your expectations.

If that sounds like your organization and what you’d like to achieve, I believe these five ideas could really help:  

  1. Get clarity about your environment’s current state.
  2. Provision virtual machine resources based upon workload requirements.
  3. Understand agent activity on your machines.
  4. Avoid SQL resource starvation.
  5. Conduct capacity planning to enhance environment predictability.  

Let’s take a closer look at each of these ideas.

Get Clarity About Your Environment’s Current State Capabilities

The starting point for doubling performance is documenting current state. This is a relatively straightforward exercise, yet one I see most organizations not engaging in. Why do you need to do this?

  • You need baseline performance metrics to help you document your current capabilities so you can define and recognize “poor” performance.
  • You need documentation about all components within the system so you can easily spot obvious problems. There are several components within eDiscovery environments that could produce slow-downs, weak performance and outages. If you can compare actual performance, by way of log files and the like, with the manufacturer’s projected performance, you just might be able to narrow down the cause of your problems. This can expedite resolutions.
  • You want to be predictive about future-state performance so your efforts are focused. For example, if your current Processing performance yields 2.5 Gigabytes (GB) per hour per worker but your expectation is to be at 5 GB per hour per worker, you know where to focus—enhancing Processing throughput.
  • If you can spot problem areas, you can engage in intelligent fixes that might not cost you anything. The default solution we see organizations leaning toward, when there are performance issues, is to throw more hardware at their problems. In some instances, that’s necessary. But in many, many other instances, that won’t really solve your problem.
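To make the baseline idea concrete, here is a minimal sketch of how the Processing figure from the example above could be computed and compared with a target. The `ProcessingRun` structure, the function names and the sample job figures are all hypothetical; only the 2.5 and 5 GB per hour per worker values come from the example in this article.

```python
from dataclasses import dataclass


@dataclass
class ProcessingRun:
    """One observed Processing job (hypothetical sample data)."""
    gigabytes: float  # volume of data processed
    hours: float      # wall-clock duration of the job
    workers: int      # number of Processing workers used


def gb_per_hour_per_worker(run: ProcessingRun) -> float:
    """Normalize a run to GB per hour per worker."""
    return run.gigabytes / (run.hours * run.workers)


def throughput_gap(runs: list[ProcessingRun], target: float) -> tuple[float, float]:
    """Return (baseline, shortfall): the mean observed rate and how far
    it sits below the target (0.0 if the target is already met)."""
    rates = [gb_per_hour_per_worker(r) for r in runs]
    baseline = sum(rates) / len(rates)
    return baseline, max(0.0, target - baseline)


# Hypothetical jobs pulled from processing logs.
runs = [
    ProcessingRun(gigabytes=100, hours=10, workers=4),
    ProcessingRun(gigabytes=60, hours=8, workers=3),
]
baseline, shortfall = throughput_gap(runs, target=5.0)
print(f"baseline={baseline:.2f} GB/h/worker, shortfall={shortfall:.2f}")
```

Tracking the same number over time, on datasets that resemble your real matters, is what turns it into a usable baseline.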

I recommend that you document performance in two areas: systems and throughput (the actual amount of work you’re getting completed). For systems, I recommend this type of analysis:

  • Physical Servers. How many servers do you have? How old are they? What are their specifications for CPU and RAM?
  • Storage. How many systems do you have? How old are they? How much unused capacity do you have today? What are their specifications? We often find that the types of storage organizations use can significantly impact cost and performance. We recommend a tiered architecture that would likely include high-speed (flash) systems and lower-speed systems, based on the tasks they need to handle.
  • Virtualization specifications. How many virtual machines are you running in total? How many are you running per physical server? How many physical CPUs are available versus allocated (virtual CPUs) on each host? Have you, perhaps, under-invested in the physical requirements necessary to support your virtual infrastructure?
  • SQL databases. How many SQL databases are in your environment today? How many SQL servers do you have to support them? Do you have the right balance between SQL Standard versus SQL Enterprise licenses? We often find that organizations do not engineer their SQL environment to support their actual work-flows.
  • eDiscovery application. What application portfolio is in use and how is it used in your workflow? How many licenses do you have? Which platform did you choose to go with? What are your current utilization trends? How does your actual performance compare to the application vendor’s benchmarks?
  • Network. Is your eDiscovery environment segregated from your general IT environment? I recommend that you document your network systems and design. There are many tools that will allow you to create network maps quickly and cost effectively. I recommend that you also document the technical specifications of key network devices, primarily switches and routers. Network configuration also can have a substantial impact on your security posture.

To document throughput, I recommend these types of analyses:

  • Average daily and weekly Reviewer productivity. What is the activity of your reviewers on a daily and weekly basis? How many concurrent reviewers will your environment support and do you see a degradation of performance when multiple reviewers are working simultaneously?
  • Processing speeds. How long does it take to process 100 GB of data? How many people are involved in the process? What datasets are you leveraging to capture this benchmark and does this align with the make-up of datasets your team receives for productions? We often see this benchmark being set against a completely different dataset type than Productions, rendering the benchmark irrelevant.
  • SQL database performance. How fast are your SQL databases today? How long does an average query take to produce a response? What is the average time to load workspaces? Is your storage performing fast enough to allow the application to be highly responsive?
  • CPU availability and utilization. How taxed are the CPUs in your servers and review platform computer systems? Are you currently over-committing your CPU resources (allocating more CPU to machines on the server than the server itself has available)? Does the hypervisor ever prevent tasks from being scheduled simply because system resources are unavailable?
  • RAM availability and utilization. How much RAM does your environment have today? Do you have enough RAM to support your environment? We often find that is not the case. Is your memory properly allocated to memory-hungry tasks and processes? Are you currently over-committing your RAM resources (allocating more RAM to machines on the server than the server itself has available)?
  • Size of largest matter. What is the size of the largest matter you’ve been able to handle, as measured by the document table size? Is it 1TB, 500GB, something else? How did it go handling this large matter and what lessons did you learn?
  • Average matter count. How many matters can you reasonably handle today? Do your constraints come from people, process, or technology? I recommend that you review average matter count by week, month and year. This will help you understand your current capacities.
  • Storage Utilization. What is your data footprint growth rate? When must you expand storage, or remove data from production to maintain healthy capacity? Are the most active matters sitting in the appropriate tier of storage?
  • Storage Capacity Management. How long are inactive cases residing on your production storage? What are your data governance SLAs? Are your archive and restoration times for cases in “cold or nearline” storage in compliance with your contractual obligations?

This first step is essential to establish an accurate picture of your environment’s current state. This analysis will be helpful in targeting areas to address to improve performance. It will also help you take full advantage of the other points of counsel in this article.

Provision Virtual Machine Resources Based Upon Workload Requirements

If you want to double the performance of your current eDiscovery environment for little to no cost, I recommend that you pay close attention to your virtual machines (VMs). These are software-based servers that emulate the performance of an actual physical server. Why do I recommend this?

When a client tells me that their environment is experiencing sluggish performance and suboptimal speeds, I first examine the utilization rate and resource footprint of their VMs. The findings rarely surprise me. In my experience, 90% of eDiscovery virtual machines are configured using a standard IT practice called “over-committing the host.” This is the number one culprit for poor performance at the virtualization layer of the platform. In many VM configurations, over-commitment is the default setting out of the box. This is problematic for two very important reasons:

  • VMs consume resources differently based upon their task, type, quantity and the evolving demands of practitioners.
  • VMs, when using default IT configurations, often compete and cannibalize resources, degrading eDiscovery platform performance.

If you’re cannibalizing your resources, you essentially have two options today: scale out or scale up. In the scale up approach, you secure additional resources for your underlying hosts. This is about deploying fewer, yet more powerful, machines. 

In the scale out approach, our recommended approach, you deploy more physical machines with fewer resources per machine. This allows each machine to handle a finite work-load. We find that this produces overall better performance because no one physical machine is being over-tasked. This allows for more effective load balancing across the eDiscovery environment. Clients who’ve done this have experienced up to a 100% uptick in compute efficiency. Moreover, the heightened performance is immediate and lasting. So how does this work?

VMs consume the resources of an actual physical server, particularly CPUs and RAM, based on their allocation. But not all VM functions require the same allocation, so they do not all need to consume resources at the same level. For example, a VM involved in Processing will require a substantially higher allocation of the physical server’s resources than, say, a typical agent server (please note that some agents or processes require just as many resources as Processing, or more). But if other VMs on that same physical machine are configured to consume the same amount of resources as the Processing VM, the Processing VM just might under-perform.

The problem with the default settings in VMs is that they then consume resources as if they were involved in heavy-compute tasks, even if they are not. If you have 10 VMs on one physical server and all 10 VMs are configured as if they are doing heavy work, they will try to pull more compute power than the physical server has available. That will produce sluggish overall performance, and unpredictable behavior as random VMs win the battle for resources.
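The 10-VM scenario above can be expressed as a simple over-commit ratio. The sketch below is illustrative only: the host size (32 physical cores) and the per-VM vCPU figures are assumptions made up for the example, not recommended values, and the function name is hypothetical.

```python
def overcommit_ratio(allocated: list[int], physical: int) -> float:
    """Resources promised to VMs divided by what the host physically has.
    A value above 1.0 means the host is over-committed."""
    return sum(allocated) / physical


# Hypothetical host with 32 physical cores.
host_cores = 32

# Default-style sizing: all 10 VMs configured as if doing heavy work.
uniform_vcpus = [8] * 10

# Workload-based sizing (assumed figures): two Processing VMs keep
# 8 vCPUs each, eight lighter VMs are trimmed to 2 vCPUs each.
right_sized_vcpus = [8, 8] + [2] * 8

print(f"uniform sizing:     {overcommit_ratio(uniform_vcpus, host_cores):.2f}")
print(f"right-sized by role: {overcommit_ratio(right_sized_vcpus, host_cores):.2f}")
```

The same ratio can be computed for RAM. The point is not a specific threshold but seeing, per host, how far allocations have drifted from what the hardware can actually supply.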

Understand Agent Activity

If you want to substantially increase performance without buying a bunch of new technology, I recommend that you closely analyze your use of agents. These are features or functions that sit in the middleware of most eDiscovery applications. Agents deliver a lot of benefits and can handle thousands of different tasks within a work-flow.

In our experience, two agents in particular can really degrade performance if they are not properly configured: Search and Conversion (near-native rendering). They’re powerful, but they also can consume a lot of precious and limited resources. When they are on the same Virtual Machine and launch at the same time, the results can be disastrous.

My recommendation is that you DO NOT pair these two agents on the same VM. In fact, I recommend that you balance all of the agents for your primary eDiscovery application across numerous VMs to ensure the most efficient operation. There are two reasons I say this:

  1. Similar to the way Virtual Machines compete for physical resources, these agents attempt to consume ALL available virtual resources and this undermines the performance of one or both agents. This means that, right in the middle of an important review, neither agent will function properly.
  2. The likelihood of machine or complete system failure increases exponentially when agents are not balanced appropriately across multiple machines. This seems to happen right in the middle of an active and heavy review.

Fortunately, there’s a simple solution to completely avoid these detrimental consequences. Move your Conversion and Search agents to their own dedicated VMs. This segregation means that agents no longer compete for the same resources and can be launched in tandem. Once you implement this configuration tweak, your organization will likely realize immediate financial and operational gains. Reviewers will increase the volume and speed of matter execution. Additionally, you curb the likelihood of VM performance degradation and costly outages. 
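A quick way to audit this is to list which agents run on which VM and flag any VM hosting both heavy agents. The sketch below assumes you can export that mapping from your platform. The VM names, the `OtherAgent` entry and the function name are hypothetical placeholders.

```python
# The two resource-hungry agents called out in this article.
HEAVY_AGENTS = {"Search", "Conversion"}


def colocated_heavy_agents(placement: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the VMs that host more than one heavy agent,
    mapped to the heavy agents found on each."""
    conflicts = {}
    for vm, agents in placement.items():
        heavy = agents & HEAVY_AGENTS
        if len(heavy) > 1:
            conflicts[vm] = heavy
    return conflicts


# Hypothetical inventory: VM name -> agents running on it.
placement = {
    "agent-vm-01": {"Search", "Conversion", "OtherAgent"},
    "agent-vm-02": {"Search"},
    "agent-vm-03": {"Conversion"},
}

for vm, heavy in colocated_heavy_agents(placement).items():
    print(f"{vm}: move one of {sorted(heavy)} to a dedicated VM")
```

In this made-up inventory only `agent-vm-01` is flagged, since it is the only VM where Search and Conversion would compete for the same resources.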

Avoid SQL Resource Starvation

If you want to substantially increase performance without a lot of cost, I recommend that you analyze the memory usage of your SQL databases. Why do I recommend this? In nearly 99% of eDiscovery environment assessments I’ve conducted, Random Access Memory (RAM) is improperly sized to meet eDiscovery SQL database performance requirements. This phenomenon is colloquially called ‘SQL Memory Starvation,’ and its implications are huge, especially for reviewer speed and throughput capabilities.

In my experience, organizations typically have more CPU resources than required to properly provision their SQL servers. However, they typically do not have enough RAM. This leads to another problem as it relates to costs. Please allow me to explain this.

In eDiscovery, SQL databases fly through millions of rows of data to return to you the single data-point you are seeking. For SQL databases to do this, they need a lot of IO throughput. Fast disk-storage alone will not address this and neither will adding CPUs. In fact, adding CPUs could actually slow down the overall performance.

Here is how this works, from a technical perspective. Search queries are returned to reviewers based on the IO available to the system. When systems run out of RAM to analyze the data-sets, they turn to hard-disks to process the data. Hard disks are drastically slower than RAM. So if you want to really improve performance, add more RAM. 
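The effect of spilling from RAM to disk can be approximated with a weighted average of the two latencies. This is a simplified model, not a description of how any particular database engine schedules IO, and the latency figures below are placeholder assumptions chosen only to show the shape of the curve. Measure your own systems before drawing conclusions.

```python
def effective_read_latency_us(hit_ratio: float, ram_us: float, disk_us: float) -> float:
    """Average latency per read when a fraction `hit_ratio` of reads is
    served from RAM and the remainder falls through to disk."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ram_us + (1.0 - hit_ratio) * disk_us


# Placeholder latencies in microseconds (assumptions, not measurements).
RAM_US = 0.1
DISK_US = 1000.0

for hit_ratio in (0.50, 0.90, 0.99):
    latency = effective_read_latency_us(hit_ratio, RAM_US, DISK_US)
    print(f"{hit_ratio:.0%} of reads from RAM -> about {latency:.1f} us per read")
```

Under these assumptions the disk term dominates until nearly all reads are served from memory, which is why adding RAM tends to help more than adding CPUs.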

This actually solves a financial problem too. SQL server licensing, which is the largest cost component, is based on allocated CPUs. We often find that organizations are paying for SQL servers that are not being used to full capacity simply because there is not enough RAM on the systems to optimize their performance. So if you really want to improve your performance, acquire and provision more RAM (which is often quite a bit cheaper than CPUs) and right-size the number and type of SQL servers based on your actual needs. Once you’ve conducted this resource rebalancing exercise, your organization could quite possibly realize operational gains immediately. I’ve personally witnessed increases in reviewer speed by 50%, 70%, or even up to 100%.

Conduct Capacity Planning to Enhance Environment Predictability

If you want to be ready to perform well on your next big case, I recommend that you conduct a capacity planning exercise. This requires a data-driven analysis of both your sales pipeline and your environment’s current capabilities. There are three phases to capacity planning:

  1. Document your current-state capacities. In my first point in this article, I demonstrated how to do this.
  2. Document your historical matter performance. This is about understanding the ebbs and flows of matters being handled by your team. To do this, I recommend that you analyze, over the trailing 36 months, three primary factors:
    1. How many matters did we handle, per month and per year, for the trailing three years?
    2. What trends can we see in matter count and matter size? In other words, were we handling 10 matters per month with an average of 25 GB per matter? Were we handling 100 matters per month with an average of 50 GB per matter? What’s important here is to recognize trends, because that can help you plan for the future so you’re not caught off-guard. And, most importantly, how much did those matters grow on average, in GB, over the first 3, 6, 9 and 12 months after your team took them on?
    3. How long did it take us, on average, to conduct a first pass to Production on each matter?
  3. Analyze the pipeline of what you can see today for new matters about to enter your environment. Add to this a reasonable projection of what you think might happen over the next 24-36 months.
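The three phases above can be combined into a rough runway estimate. The sketch below assumes linear growth and uses made-up storage figures. The function names and the `pipeline_gb` parameter are hypothetical, and a real analysis would also cover compute, licenses and staff, not just storage.

```python
from typing import Optional


def average_monthly_growth_gb(monthly_footprint_gb: list[float]) -> float:
    """Mean month-over-month change across the historical footprint."""
    pairs = zip(monthly_footprint_gb, monthly_footprint_gb[1:])
    deltas = [later - earlier for earlier, later in pairs]
    return sum(deltas) / len(deltas)


def months_of_runway(current_gb: float, capacity_gb: float,
                     growth_gb_per_month: float,
                     pipeline_gb: float = 0.0) -> Optional[int]:
    """Whole months of growth the remaining headroom can absorb,
    after reserving space for known pipeline matters.
    Returns None when there is no growth under this linear model."""
    headroom = capacity_gb - current_gb - pipeline_gb
    if headroom <= 0:
        return 0
    if growth_gb_per_month <= 0:
        return None
    return int(headroom // growth_gb_per_month)


# Hypothetical month-end storage footprints in GB (oldest first).
history = [1000.0, 1100.0, 1250.0, 1300.0]
growth = average_monthly_growth_gb(history)

runway = months_of_runway(current_gb=history[-1], capacity_gb=2000.0,
                          growth_gb_per_month=growth, pipeline_gb=200.0)
print(f"average growth: {growth:.0f} GB/month, runway: {runway} months")
```

Re-running the estimate with a large new matter added to `pipeline_gb` shows quickly whether you would need to provision before accepting it.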

All of this activity will help you understand what your actual capacities are today and, therefore, what you would need to provision quickly should you exceed those capacities—especially if a great big new matter were all of a sudden introduced into your environment.

One of the biggest mistakes I see organizations make is to take on a matter that they’re really not prepared to handle. This can lead to all sorts of negative outcomes such as:

  • Overwhelming existing staff and technology resources.
  • Missing deadlines and all of the associated fall-out.
  • Negatively impacting existing matters that might be delayed so you can focus on the large matter.
  • Reputational harm with the client who brought you the large matter—especially if they are a long-term client that you want to retain.

You can avoid all of these outcomes with a capacity planning exercise. The information gleaned from this analysis can provide an essential vantage point from which your organization can proactively:

  • Construct departmental budgets based upon technology and business requirement forecasts.
  • Procure and provision hardware and software based upon budget and sales parameters.
  • Manage client expectations regarding SLAs, engagement scope and delivery timetables.
  • Build an emergency action plan to rapidly beef up resources when the new matter comes in. This will ensure you are putting your money toward the equipment that will most likely give you the performance you need now and in the future.

I recognize that capacity planning for eDiscovery environments isn’t an exact science. However, in my opinion, any exercise that positions key stakeholders to make informed decisions based upon resource alignment is a worthwhile endeavor. I liken lesser alternatives to throwing darts at a dart board with a blindfold on. Please don’t make this mistake.

How To Make This Actionable

This thought piece is a direct response to two very important questions for stakeholders responsible for the eDiscovery function. First, how do we do more with less, given that budgets and resources are limited? Second, how do we handle a large new project that could overwhelm our existing capacities? My recommendations are that you:

  1. Get clarity about your environment’s current state.
  2. Provision virtual machine resources based upon workload requirements.
  3. Understand agent activity on your machines.
  4. Avoid SQL resource starvation.
  5. Conduct capacity planning to enhance environment predictability.

Organizations that have taken these steps have realized huge gains in performance and staff productivity for little-to-no additional costs. Most organizations that I’ve analyzed don’t need more equipment. They need a better approach to using the equipment they already own. If you have any questions about the points I’ve discussed here, please know my door is open.

