Ontrack Data Recovery

Personal Email Information Storage - Overview and Recoverability

 

Electronic mail, or Email, continues to grow in usage and importance. Corporate users, home account users, and web account users rely on Email for virtually all types of communication. According to one estimate, public email portals such as Hotmail, Yahoo, and Lycos add millions of mailboxes every year.1 This growth has forced these providers to raise their user email storage limits, and as long as space is available, users will fill it.

Demand for email message storage grows as well, particularly in the corporate environment. Many users maintain their email archives much as they maintain their documents folder. The obvious result is duplication of data and information. Subtler problems also emerge: poor document and archive manageability, wasted hard disk space, and risk of document loss.

Data loss can affect any user. When a data disaster happens, Ontrack is your business recovery solution. Before we explore Ontrack's solutions for Email data loss, let's first get a bigger picture of the electronic messaging world.

Impact of Email Systems in Use
According to Ferris Research, the corporate email market is dominated by two of the largest contenders. Microsoft Exchange and IBM Lotus Notes together account for about 65% of the installed seats, with other installations accounting for the remaining 35%.

Here are the average distribution numbers as of January 2003.2

Microsoft Exchange 49.6%
IBM Lotus Notes/Domino 14.5%
Novell GroupWise 10.9%
Netscape 4.0%
HP OpenMail (discontinued) 1.2%
Other* 19.8%

*Other products include Sendmail, Vircom VOP, Eudora Worldmail, NIMS, Imail, Mercury, CommuniGate Pro, and Centrinity FirstClass.

Email clients follow this same trend. With this information in mind, Ontrack has focused its efforts on producing recovery tools and techniques for Microsoft® Exchange servers and Microsoft® Outlook Personal Folder Store recoveries. But what types of message storage exist, and how do they affect the recoverability of such files?

Types of Message Stores
As previously mentioned, email messaging has exploded into popular use. Millions of emails are sent and received each day. These emails are stored somewhere, whether on an email server or on the user's computer. Most users understand the delicate nature of electronic messages (how many times have you accidentally deleted a message only to find it permanently gone?), and the resulting storage habits have produced an email storage dilemma: users keep old messages in case they are needed again. In fact, it is not uncommon for corporate users to receive 200 to 300 messages a day and to save nearly as many.

While a variety of methods have been employed to handle all of this information, there is no single message store file format. All of the messaging software OEMs use different types of files to store user mailboxes and messages. For example, UNIX and Linux email client software can store messages as individual files under a specific directory in the user's home directory. Web-based email users, on the other hand, can access their mailboxes from any computer. In this case, email messages are converted into HTML pages that are displayed but not stored on the host machine. These messages are managed through the web email account, usually a scripted web page with software-like functionality that allows the user to forward, reply to, delete, or store messages.
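The one-file-per-message layout mentioned above can be illustrated with Python's standard `mailbox` module and its Maildir support. This is only a minimal sketch: the article does not name a specific UNIX or Linux format, so Maildir is an assumed example, and the addresses, subjects, and helper names (`build_message`, `store_messages`, `read_subjects`) are hypothetical.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path


def build_message(subject: str, body: str) -> EmailMessage:
    """Create a small test message (addresses are placeholders)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "user@example.com"
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def store_messages(maildir_path: Path, subjects: list[str]) -> list[str]:
    """Store one message per subject and return the resulting file names."""
    # create=True builds the tmp/, new/ and cur/ subdirectories when the
    # mailbox directory does not exist yet.
    box = mailbox.Maildir(str(maildir_path), create=True)
    try:
        for subject in subjects:
            box.add(build_message(subject, f"Body of {subject}"))
        box.flush()
    finally:
        box.close()
    # Newly delivered messages are written as individual files under new/.
    return sorted(p.name for p in (maildir_path / "new").iterdir())


def read_subjects(maildir_path: Path) -> list[str]:
    """Read every message file back and return the sorted subjects."""
    box = mailbox.Maildir(str(maildir_path), create=False)
    try:
        return sorted(msg["Subject"] for msg in box)
    finally:
        box.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "Maildir"
        files = store_messages(path, ["Budget review", "Lunch?"])
        print(len(files), "message files:", files)
        print(read_subjects(path))
```

Because each message is its own file, losing or corrupting one file affects only that message, at the cost of the server or client handling many small files.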

In the corporate arena, proprietary storage architecture is used to manage user mailboxes on the email server. This type of storage architecture keeps all of the mailboxes, messages, and attachments in a storage container, preventing the email server from having to handle thousands of individual files. How does this work in the real world? If a colleague, for example, sends out a 5 MB document for review to ten reviewers, ten separate copies of the document and message would take up about 50 MB. By using a single point of reference, a common technique in enterprise email servers, the server stores only one copy of the document and message, and each user accesses that copy.
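The single-point-of-reference idea can be sketched as a small content-addressed store. This is a generic illustration and not the way Exchange, Domino, or GroupWise actually implement it; the class name `SingleInstanceStore`, its methods, and the use of a SHA-256 digest as the lookup key are all assumptions made for the example.

```python
import hashlib


class SingleInstanceStore:
    """Toy server-side store that keeps one copy of each distinct attachment."""

    def __init__(self):
        self._blobs = {}      # digest -> attachment bytes (stored once)
        self._mailboxes = {}  # user -> list of digests (references only)

    def deliver(self, recipients, attachment: bytes) -> str:
        """Deliver an attachment to every recipient, storing the bytes once."""
        digest = hashlib.sha256(attachment).hexdigest()
        self._blobs.setdefault(digest, attachment)
        for user in recipients:
            self._mailboxes.setdefault(user, []).append(digest)
        return digest

    def stored_bytes(self) -> int:
        """Space actually used by the shared container."""
        return sum(len(blob) for blob in self._blobs.values())

    def naive_bytes(self) -> int:
        """Space that would be used if every mailbox held its own copy."""
        return sum(
            len(self._blobs[digest])
            for refs in self._mailboxes.values()
            for digest in refs
        )


if __name__ == "__main__":
    store = SingleInstanceStore()
    document = bytes(5 * 1024 * 1024)  # a 5 MB placeholder document
    reviewers = [f"reviewer{i}" for i in range(10)]
    store.deliver(reviewers, document)
    print("with single instance:", store.stored_bytes() // 2**20, "MB")
    print("one copy per mailbox:", store.naive_bytes() // 2**20, "MB")
```

The trade-off is the one the article goes on to describe: the space saving comes from concentrating many mailboxes in one container, so damage to that container can affect many users at once.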

Exchange Server and Lotus Domino3 employ compression and single-point-of-reference methods for attachments or messages that go to a number of people. To have this type of control over the user mailboxes, Exchange, Lotus, and Novell GroupWise4 store the messages and attachments in a database that the email server controls. Microsoft calls its database the Information Store. (For more information on Exchange recoveries, see the Ontrack article "Exchange Recoveries.") Other OEMs may refer to their database as the Post Office or Message Database.

This is just a brief sampling of messaging storage file types and is by no means a complete list. We will now shift attention to Microsoft's message storage files to get an overview of how they work inside and what can be stored in them.

Outlook - Personal Folder Store
The Outlook Personal Folder Store (.PST file) is part of Microsoft Outlook. Outlook works directly with Exchange and is the client interface to a user's mailbox. Outlook has the ability to deliver and store messages in the PST file. However, the PST is also designed to hold more than just email messages. Other file types can be stored inside of the PST file. This makes the PST file a robust information storage container. Internally, the PST file has a "mini-file system" that manages how data is stored.

As a result, the PST file has the unique ability to become document storage in itself. This of course goes against the old adage, "Don't put all your eggs in one basket." PST files can become corrupted internally or become at risk because of other data disasters. PST files, therefore, can become inaccessible because of hard disk failure, deletion, or by being partially overwritten. Also, PST files currently have a file size limitation of 2GB. Given that Outlook will not open a PST file that has exceeded this limit,5 utilizing the PST file as document storage puts the user data at risk. Ontrack has developed proprietary recovery techniques proven to work around this problem without losing large portions of message data.
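A simple way to act on the size limit described above is to check a message store's size before it becomes a problem. The sketch below is hypothetical: it assumes "2GB" means 2 * 1024**3 bytes, treats a file at or over that size as over the limit, and uses an arbitrary 90% warning threshold. None of these values come from Microsoft or Ontrack.

```python
import os

# Assumption: the article's "2GB" is interpreted as 2 * 1024**3 bytes.
PST_LIMIT_BYTES = 2 * 1024**3
# Assumption: warn once a store reaches 90% of the limit (arbitrary choice).
WARN_FRACTION = 0.9


def classify_size(size_bytes: int, limit: int = PST_LIMIT_BYTES) -> str:
    """Return 'ok', 'near-limit' or 'over-limit' for a store of this size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes >= limit:
        return "over-limit"
    if size_bytes >= limit * WARN_FRACTION:
        return "near-limit"
    return "ok"


def check_store(path: str) -> str:
    """Classify a PST or OST file on disk by its current size."""
    return classify_size(os.path.getsize(path))


if __name__ == "__main__":
    for size in (300 * 1024**2, int(1.95 * 1024**3), 2 * 1024**3):
        print(f"{size:>12} bytes -> {classify_size(size)}")
```

Running a check like `check_store` on a schedule gives the user a chance to archive or split the store while Outlook can still open it.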

Outlook - Offline Folder Store
The Outlook Offline Folder Store (.OST file) is part of Microsoft Outlook and provides a unique synchronization method with an Exchange Server and existing mailbox account. The OST file, like the PST file, resides on the user's machine. However, this store is constantly synchronized with the Exchange mailbox account. This means that the data in the store is duplicated: once on the server and once on the user's machine. Offline changes are synchronized when communication with the Exchange Server is restored.
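The offline behavior described above (work against a local copy, then synchronize when the server is reachable again) can be sketched with a queue of pending changes. This is a conceptual illustration only and does not reflect Outlook's or Exchange's actual synchronization protocol; the class `OfflineStore` and its methods are hypothetical, and a plain dictionary stands in for the server mailbox.

```python
class OfflineStore:
    """Toy local store that mirrors a server mailbox and queues offline edits."""

    def __init__(self, server_mailbox: dict):
        self.server = server_mailbox       # stand-in for the server mailbox
        self.local = dict(server_mailbox)  # duplicate kept on the user's machine
        self.pending = []                  # changes made while offline
        self.online = True

    def go_offline(self) -> None:
        self.online = False

    def save(self, msg_id: str, body: str) -> None:
        """Save a message locally and push or queue the change."""
        self.local[msg_id] = body
        if self.online:
            self.server[msg_id] = body
        else:
            self.pending.append(("save", msg_id, body))

    def delete(self, msg_id: str) -> None:
        """Delete a message locally and push or queue the change."""
        self.local.pop(msg_id, None)
        if self.online:
            self.server.pop(msg_id, None)
        else:
            self.pending.append(("delete", msg_id, None))

    def reconnect(self) -> None:
        """Replay queued changes, in order, once the server is reachable."""
        self.online = True
        for op, msg_id, body in self.pending:
            if op == "save":
                self.server[msg_id] = body
            else:
                self.server.pop(msg_id, None)
        self.pending.clear()


if __name__ == "__main__":
    store = OfflineStore({"1": "hello"})
    store.go_offline()
    store.save("2", "draft written on the plane")
    store.delete("1")
    print("server before reconnect:", store.server)
    store.reconnect()
    print("server after reconnect: ", store.server)
```

The sketch also shows why the local copy matters for recovery: until `reconnect` runs, the queued changes exist only on the user's machine.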

OST files are good solutions for mobile computer users and can be configured to include specific folders. However, OST files can be corrupted just like PST files and also have a 2GB physical size limit5. Access to the OST file requires an Exchange mailbox account, and if the mailbox account is missing or removed, the OST file will not open. As with PST files, Ontrack has developed proprietary recovery techniques to retrieve the data from these situations.

Outlook Express - Message Stores
For Outlook Express users, Microsoft uses a different storage container. Outlook Express is recommended for POP3 and Internet email accounts. Outlook Express manages electronic messages but without the range of features that Outlook has.6

The Outlook Express message store is also different and uses DBX files to represent folders that appear in the Outlook Express client. DBX message stores have their own internal "mini-file system" that Outlook Express uses to find messages and attachments. As with PST and OST files, DBX files can become internally corrupt. Ontrack has developed proprietary recovery techniques proven to work around this and other problems, successfully recovering Outlook Express files.

Data Loss Scenarios with Message Stores
Ontrack sees a variety of data loss scenarios every day. As with any data loss, it is access to the original data, not the backup, that users are after. One of the files customers most often submit for recovery is their message store. Whether the file is from Outlook, Lotus, or GroupWise, what are some of the common data loss situations with message stores?

Deletion: Deletions can be of the message store file itself or of internal objects, such as folders or emails.
Overwritten: The message store file can be overwritten by a previous backup. Internal objects can be overwritten through an import or by manually moving messages with the same name.8
Re-install of OS: A common data loss scenario is the re-installation of the operating system. In the case of the Windows operating system, Explorer is an integral part of the system. Outlook Express relies heavily on Explorer to define the system file locations. A re-installation can overwrite Profiles that contain the locations of these files. Windows XP has a safeguard that does not allow a second installation to reuse the same user name, as long as the volume is not reformatted. In this way, the user profile in the Documents and Settings folder is not lost.
Internal Corruption: Message stores are complex structures internally. An application error may write incorrect data inside the file. This corruption prevents the email client from accessing the data correctly.
Truncated File Size due to Volume Errors: Corruption can happen at the file system level. File systems may develop minor inconsistencies that prevent a file from being accessed correctly. Volume errors may invoke the operating system's internal repair tool. These tools work to get the volume ready to read and write data, and sometimes data is sacrificed during the repair process. If there are volume errors, the repair tool may truncate or "chop" the file. This can remove large sections of the file, making it inaccessible.

The Ontrack Solution to Data Loss for Microsoft® Message Stores
Losing data, which represents everything from employee time on projects to company communication, is a serious disaster. When the loss involves email containing ideas, policies, plans, financial data, and virtually any other type of intellectual property or information, any loss is critical. For this reason, Ontrack goes beyond recovering files and works to repair the internal structure of the file so that the original data becomes accessible again.

Ontrack has dedicated time and research to developing data recovery tools for Outlook Message Stores. These tools work on the premise of: (1) finding the data, (2) building a virtual structure within memory as to where that data is located, and (3) copying out the messages.

Currently, Ontrack has developed high-quality tools for Outlook PST/OST files as well as for Outlook Express files. These repair tools are designed to provide a superior recovery and will even work with deleted folders or messages. Once data is deleted from either of these message stores, the internal file system marks that space as "free." Ontrack tools will find the pieces of the message and put as much of it back together as possible.
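The three-step premise (find the data, build an in-memory structure describing where it is, copy out the messages) can be illustrated on a made-up container format. This sketch is not Ontrack's method and does not parse real PST, OST, or DBX files: the `MSG\0` signature, the one-byte status flag, and the record layout are invented for the example. It does show why items marked "free" can still be recovered while their bytes remain in the file.

```python
import struct
from dataclasses import dataclass

MARKER = b"MSG\x00"              # hypothetical record signature
HEADER = struct.Struct("<4sBI")  # signature, status flag, payload length
LIVE, FREE = 0, 1                # FREE = deleted, space marked reusable


@dataclass
class Entry:
    offset: int    # where the payload starts inside the container
    length: int    # payload length in bytes
    deleted: bool  # True when the record was marked "free"


def pack_record(payload: bytes, status: int = LIVE) -> bytes:
    """Build one record of the toy format (used to create test data)."""
    return HEADER.pack(MARKER, status, len(payload)) + payload


def scan(data: bytes) -> list[Entry]:
    """Steps 1 and 2: locate plausible records and index them in memory."""
    index = []
    pos = data.find(MARKER)
    while pos != -1:
        payload_start = pos + HEADER.size
        if payload_start <= len(data):
            _, status, length = HEADER.unpack_from(data, pos)
            # Accept the record only if its header is sane and the payload
            # fits inside the file; truncated records are skipped.
            if status in (LIVE, FREE) and payload_start + length <= len(data):
                index.append(Entry(payload_start, length, status == FREE))
                pos = data.find(MARKER, payload_start + length)
                continue
        pos = data.find(MARKER, pos + 1)
    return index


def extract(data: bytes, index: list[Entry]) -> list[bytes]:
    """Step 3: copy each indexed message out of the container."""
    return [data[e.offset:e.offset + e.length] for e in index]


if __name__ == "__main__":
    damaged = (
        b"\x00\x01garbage"                    # junk before the first record
        + pack_record(b"hello")
        + b"\xff" * 7                         # damaged region
        + pack_record(b"deleted note", FREE)  # deleted but still present
        + pack_record(b"x" * 100)[:20]        # record cut off by truncation
    )
    found = scan(damaged)
    for entry, body in zip(found, extract(damaged, found)):
        print("deleted" if entry.deleted else "live   ", body)
```

Scanning for signatures instead of following the container's own pointers is what lets this approach survive damaged regions, at the price of having to validate every candidate it finds.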

Ontrack sells EasyRecovery EmailRepair that incorporates the same recovery functions used in our data recovery labs. This product currently has Outlook Express mail (DBX) and Outlook (PST/OST) file recovery support.

EasyRecovery EmailRepair's Outlook Express recovery engine works to find damaged messages inside the corrupted DBX file. The software is designed to scan quickly through the file and examine each damaged message it finds to verify that it is valid. The engine then extracts the data it finds into a native DBX format, which makes importing the data back into Outlook Express very easy.


Here is a reference grid of Ontrack's Message Store recovery abilities:

Message Store Recovery Abilities
Email Solution Recover lost or deleted files Recover individual mail items from Outlook database file (pst, ost) Recover individual mail items from Outlook Express database file (dbx) Recover individual mail items from Exchange server database file (edb) Recover individual mail items from Lotus Notes server database file (NSF files)**
ER Lite (up to 25 files) NO NO NO NO
ER DataRecovery NO NO NO NO
ER FileRepair NO NO NO NO
ER EmailRepair NO NO NO
PowerControls NO NO NO NO
DR Service
RDR Service

** This type of recovery focuses on getting back the files themselves and does not go through the internal structure of these files.

Your client's data is their most valued asset. When it is lost due to a data disaster, Ontrack is your business partner for success. Ontrack is constantly working hard to provide cutting-edge recovery solutions for all types of data loss circumstances.

As long as the data can be found, rebuilt, or repaired, your client's data loss will be minimized. Whether data loss is from a hard disk failure, a file system failure, internal file corruption or human error, Ontrack has proven proprietary processes and techniques that yield the highest quality recovery results.

References
1. "Message Store Content Optimization: Cost Savings for Large Scale Messaging Environments" by Ferris Research, © February 2001
2. "Corporate Email Issues Part I: Systems and Usage" by Ferris Research, © January 2003
3. IBM Lotus Domino Server 6 - IBM Lotus Domino 6 - Server Scalability and Performance
4. Novell GroupWise Databases - Information Store in the Post Office
5. "PST and OST Files Stop Accepting Data" - Microsoft Knowledge Base Article
6. Outlook vs. Outlook Express - Microsoft Knowledge Base Article
7. How to Integrate Old Message Stores - Microsoft Knowledge Base Article
8. How to Import Outlook Express Message Stores - Microsoft Knowledge Base Article

© 2003 Kroll Ontrack Inc. All rights reserved. Not to be reproduced without this notice.