If you’re familiar with Exchange, you know that when it comes to storage and performance, you need to make sure that Exchange doesn’t compete with other disk-intensive applications such as Microsoft SQL Server. Fortunately, the Exchange JetStress tool can help you validate this (we’ll get to that a bit later). With each version of Exchange, Microsoft releases an updated Exchange Calculator that lets you plan your storage based on specific criteria. The Exchange Calculator is an Excel spreadsheet that allows you to enter values as shown below:
To download this calculator, go to this Microsoft web page.
You will notice that this is for Exchange 2019. Each section allows you to enter a value or select a value. Depending on what you select and type in, it then fills in the tabs at the bottom, such as:
- Role requirements
- Volume requirements
- Backup requirements
- Replication requirements
- Modeling mailbox space
- Storage design
- Version changes
Not a one-size-fits-all approach
Each environment will have different requirements. There is no one-size-fits-all approach. For example, your organization might want four copies of the data, which changes your backup requirements and the amount of storage you need. Although creating and maintaining the spreadsheet is beyond the scope of this article, you can read more about it at this link.
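To see why the number of copies matters so much, a rough back-of-the-envelope sketch can help. This is not the Exchange Calculator's formula: the mailbox count, mailbox size, and overhead factor below are made-up illustrative values, and the function name is hypothetical.

```python
def raw_storage_tb(mailboxes, mailbox_gb, copies, overhead=1.2):
    """Very rough estimate of total database storage in TB.

    overhead is an assumed fudge factor for free space, indexes, and logs;
    it is NOT taken from the Exchange Calculator.
    """
    per_copy_gb = mailboxes * mailbox_gb * overhead
    return per_copy_gb * copies / 1024


# Hypothetical organization: 2,000 mailboxes of 5 GB each.
for copies in (1, 2, 3, 4):
    print(f"{copies} copies: {raw_storage_tb(2000, 5, copies):.1f} TB")
```

Even in this simplified model, storage grows linearly with every extra copy, which is why the copy count is one of the first decisions to settle before buying hardware. Use the real calculator for actual sizing.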
Once you’ve filled it all out, you’re happy with the design, and you’ve purchased your storage, you need to put it to the test. Microsoft provides a JetStress tool that works with Exchange 2013 and 2016. One of the requirements is that you have the correct DLL files placed in the JetStress installation folder. There was no mention of an update for Exchange 2019, but you should still be able to use it to validate the storage and its performance. If you have a new storage area network (SAN) dedicated to Exchange, you should be able to run the tool at full load. But if you are sharing the SAN with other applications, consider running Exchange JetStress after working hours so as not to impact them. Here is the link to download the JetStress tool.
Exchange JetStress scenarios
Once you’ve downloaded the file and performed the installation, it’s time to run the first simulation. At one of my customers, they purchased storage for Exchange 2016/2019, and we started testing in the scenario below:
- 4 databases, 1 copy – 2 threads
The tests took an hour at a time and we captured the output in an Excel spreadsheet so that we could compare the values. We then moved on to the following test:
- 4 databases, 2 copies – 2 threads
The tests again lasted an hour and we captured the results in our spreadsheet. We then moved on to the following test:
- 4 databases, 3 copies – 2 threads
The tests again lasted an hour and the output was captured. We were happy with the databases and the copies, so we started increasing the threads. The following test had four threads:
- 4 databases, 3 copies – 4 threads
During the day, we increased the threads to 10 (still with four databases and three copies), and we crashed a controller on the storage. It was due to a vendor software bug. We continued to run the tests and ultimately crashed the entire storage unit.
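The progression described above can be sketched as a small test plan written to CSV, so that each run's numbers land in one place for comparison. This is only an illustration: JetStress is configured through its own interface, not through this script, and the field names, including the `result` column that would be filled in by hand from the JetStress report, are hypothetical. Only the thread counts mentioned in this article are listed.

```python
import csv
import io
from dataclasses import dataclass, asdict


@dataclass
class Scenario:
    databases: int
    copies: int
    threads: int
    duration_hours: int = 1  # the runs described here lasted about an hour
    result: str = ""         # hypothetical field, filled in after each run


def build_plan():
    """Grow copies first, then threads, changing one variable per run."""
    plan = [Scenario(4, copies, 2) for copies in (1, 2, 3)]
    plan += [Scenario(4, 3, threads) for threads in (4, 10)]
    return plan


def write_plan(plan, handle):
    """Write the plan as CSV so results can be compared in a spreadsheet."""
    writer = csv.DictWriter(handle, fieldnames=list(asdict(plan[0])))
    writer.writeheader()
    for scenario in plan:
        writer.writerow(asdict(scenario))


buffer = io.StringIO()
write_plan(build_plan(), buffer)
print(buffer.getvalue())
```

Changing only one variable between runs makes it easier to tell which change triggered a failure, as happened here when the thread count went up.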
The vendor had a new version, so we applied the firmware and repeated the same tests. Testing ran for a few days before a controller on the storage unit crashed again. After some back-and-forth with the vendor, they reproduced the problem on their side and released new firmware.
The tests went on for months until we got to a point where all the bugs were fixed, and we were able to run our tests successfully with four databases and three copies using 10 threads. The unit was put into production. The first Exchange 2016 pair was introduced and storage was added. The Exchange 2016 storage was configured with one copy on this SAN and the other two distributed over two other SANs.
One day, the underlying hypervisor failed over to the new storage, and we started receiving alerts for a failed database. It turned out that if you live migrate a machine on this storage, the underlying storage gets formatted. Wait, what? Yes, a major flaw in the hardware. It took us a few hours to get the server back and get the database seeding again.
We were able to reproduce the issue with a test machine and found that if you shut down the machine before moving it, you do not experience data loss. Back to the drawing board for the vendor.
Time well spent
An emergency patch was released and applied. That brings me to this conclusion: You need to test your storage over and over again and make sure you’re happy with it before it goes into production. Our testing lasted a year because we wanted to make sure everything was covered and the storage didn’t fail on us. Hopefully, what I’m writing here will save you the heartache of losing production servers because you haven’t properly tested your storage.
If you are not happy with the storage, delay delivery of your project until everything is working 100% for you. Problems such as the storage being formatted during a live migration may only show up in production, because Exchange JetStress does not test virtual machine failover.