Data Recovery Articles
When data is lost, the most important question is: is the data still retrievable? The answer determines what action needs to be taken: whether to pursue data recovery or to develop strategies for coping with the data loss.
The loss is often not easy to measure. Sometimes it is not fully clear what caused the data loss in the first place. A technician might already have tried to solve the problem. Also, the effect of common remedies, such as Microsoft's "Checkdisk", on recoverability is largely unknown. This article is meant to sort out what is possible and what is not. Data Detect will try to explain, for a given example, what can be expected. For this purpose we must restrict the idea of "recoverability" to "commercially available and affordable data recovery". While the magnetization that once constituted the data might still be present on the media, the technology to recover the data at an economically reasonable price is often not available.
The Type of Data Influences the Success Rate
We must also consider the kind of data that needs to be recovered. Assume you are able to recover a hypothetical 90% of all lost files. If these files were pictures, you can consider this rate a success: you got 9 out of 10 pictures back. If your files were tables of a database and 10% are missing, the entire database will probably be worthless because the tables depend on each other. The more interdependent the data is, the greater the devastation if even a small percentage is missing. We will also look at what "90% recovered" really means. Another interesting aspect is the "time dimension": a data recovery is usually worth less with each day, sometimes each hour, that passes.
Physical and Logical Data Recovery
We need to distinguish between two different procedures:
- Extracting the raw data from the affected media (physical data recovery)
- Reconstructing the files (logical data recovery)
You can have pure logical data losses. For example, file deletion, drive formatting or virus attacks only require logical reconstruction (2). On the other hand, a mechanically failed drive that is successfully repaired (1) will not need logical reconstruction.
In reality many physical problems will need subsequent logical reconstruction because not all data has been retrieved.
Dead Hard Drive
A drive can be considered "dead" if it is not accessible by the operating system or by the system BIOS, Windows' Disk Management or Mac utilities. A dead drive often shows additional symptoms: it does not spin, it "clicks", or it makes other strange noises.
These drives might have a damaged electronic board, damaged read heads, a damaged motor or damaged magnetic media. Data Detect (class 100 certified clean environment) can often resurrect the drive by exchanging the damaged parts. We will then image the drive and perform a logical file reconstruction.
This approach is sometimes successful and then well worth the cost of several hundred or even thousands of dollars; however, sometimes it is not successful.
Physical recovery is not always possible
First, success depends on the extent of the damage. It is not possible, even in theory, to recover data from a platter that was heated up to the "Curie temperature" (which is 770°C for iron). This temperature completely demagnetizes the platters. It also seems doubtful whether anybody will recover data from a drive that fell on a hard floor. If the platters are unbalanced due to bending or impact, they will vibrate while spinning. If the vertical amplitude of this vibration is larger than the distance the read head flies at (50µm), the drive will sustain a permanent head crash, making it impossible to read the magnetic information and further destroying the surface. Horizontal vibration will make it impossible for the head to stay on the track, which is thinner than 1µm. While tire shops apply weights to a wheel in order to balance the tire, a comparable technology for unbalanced platters is unknown.
The only technology possibly capable of overcoming this problem is Magnetic Force Microscope (MFM) photography, since this technique does not require the platter to spin. However, MFM requires scanning the whole surface of the platter. The MFM moves from region to region, each region yielding a picture. This alone will take several months. Then all these pictures must be stitched together. A 20 GB hard drive consists of 160,000,000,000 bits, probably 300,000,000,000 bits including overhead. Each bit is represented by a magnetic flux change. A picture displaying this flux change will probably use 100 bytes, thus inflating each bit by a factor of about 1000. You would have to analyze roughly 40 terabytes of data. It is unknown if this technology is in use. It certainly is not "commercially available and affordable", because such a data recovery would cost hundreds of thousands of dollars.
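The estimate above can be checked with a few lines of arithmetic. This is only a sketch that reuses the article's own rough figures (300,000,000,000 bits including overhead and an inflation factor of about 1000); the variable names are chosen for illustration.

```python
# Rough figures taken from the text above; both are estimates, not measurements.
raw_bits = 300_000_000_000   # bits on a 20 GB drive including overhead
inflation = 1000             # image data per stored bit, as a rough factor

image_bits = raw_bits * inflation
image_bytes = image_bits // 8
image_terabytes = image_bytes / 1e12

print(f"{image_terabytes:.1f} TB of image data")
```

The result, 37.5 TB, is the "roughly 40 terabytes" quoted in the text.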
Second, success also depends on the drive type. Many data recovery companies can "do" certain drives but cannot do others. Modern drives are conditioned after their assembly to work perfectly with the parts built in, heads, platters etc. It is often not possible to use parts of another drive, even if both drives share the same model number.
There are no "magic" machines that are capable of recovering the data from any kind of drive. If the raw data can be retrieved, a subsequent logical reconstruction of the files must be performed.
Drives With "Bad Sectors"
These drives are still recognized by the BIOS, but they have read errors in one or more spots. After a drive image has been obtained by cloning, the file system in the image needs to be reconstructed before the files can be recognized.
Data Detect always makes an image
It is not recommended that you try this yourself, as it may further degrade the drive and make it more difficult and more expensive to recover from. Instead, you might give it to a data recovery service company.
Attach external drives directly to your motherboard's IDE port
If you have an external drive, for example a USB drive, you should remove it from its case and attach it to the computer as a second (slave) hard drive. Also, you should use the IDE ports on the motherboard, not the IDE ports on an additional PCI card. This way you can see whether the drive is recognized in the BIOS.
We scan your drive or an image of your drive and try to put together the original state of all files in the file system. We are able to do this, even if certain file system structures are missing, such as partition table or boot records.
Let's examine in detail how a file in a FAT file system is recovered:
A file in a FAT file system is completely described by
- Its directory entry
- Its entry in the File Allocation Table (FAT)
- The allocated clusters on the drive containing the content of the file
The directory entry is picked up during the initial scan of the drive when examining each single sector. It contains the file's name, size, date, time and the first cluster of its data.
The first-cluster value points directly to the initial cluster that is allocated for the file. It also identifies the FAT entry that describes where the remaining parts of the file can be found. In this example, IMG_2379.JPG uses the clusters 4-529.
Information about a file in a FAT file system is spread among three different locations. The directory entry contains its name and where on the drive the file begins. The FAT knows where the file continues. Finally, the allocated clusters contain the file's content.
Using this information, the files can be reconstructed.
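How the directory entry and the FAT work together can be sketched in Python. This is a simplified illustration, not the method of any particular recovery tool: the FAT is modeled as a dictionary, and the name `follow_chain` and the `END_OF_CHAIN` value are assumptions made for the example (real FAT12/16/32 tables use different entry widths and a range of end markers).

```python
END_OF_CHAIN = 0xFFFF  # hypothetical end marker for this toy model


def follow_chain(fat, first_cluster):
    """Return all clusters of a file, starting at the cluster named in
    its directory entry and following the FAT from entry to entry."""
    clusters = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen or cluster not in fat:
            raise ValueError(f"broken chain at cluster {cluster}")
        seen.add(cluster)
        clusters.append(cluster)
        cluster = fat[cluster]  # the FAT entry names the next cluster
    return clusters


# Toy FAT in which a file occupies clusters 4-529, like IMG_2379.JPG above.
fat = {c: c + 1 for c in range(4, 529)}
fat[529] = END_OF_CHAIN

chain = follow_chain(fat, 4)
print(chain[0], chain[-1], len(chain))
```

If the FAT entries are gone, the loop has nothing to follow after the first cluster, which is the situation the following sections deal with.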
Because information about a file is stored at three different spots it will obviously cause problems if any of these are missing or incomplete.
FAT Recovery Matrix:
A very common situation, caused by file deletion, format or partition deletion is a missing FAT entry. As long as the file size is smaller than the cluster size (e.g. 32 KB, depending on the drive size) you will get a perfectly recovered file because you actually do not need the FAT entry.
When the file is larger, it is usually allocated in consecutive clusters. This is why the most promising strategy for data recovery software is to assume consecutive clusters when it rebuilds a file without its FAT entry. This works for most files but runs into problems for files that grow over time. These files will necessarily be fragmented if they cannot be allocated consecutively because other files meanwhile use these clusters. Sadly, many important files fall into this category: email files, databases, large documents and directories.
Several techniques are used in order to recover even fragmented files correctly. These techniques include taking the allocation of other files into consideration. Fragmented directories can also be reassembled. However, these techniques may fail for large and heavily fragmented files.
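The "assume consecutive clusters" strategy can be written down in a few lines. This is a minimal sketch with a hypothetical function name, assuming the start cluster and file size were read from a surviving directory entry; it produces a wrong cluster list for any file that was actually fragmented.

```python
import math


def assume_consecutive(first_cluster, file_size, cluster_size):
    """Guess a file's clusters without a FAT entry by assuming the file
    was stored in one unbroken run starting at first_cluster."""
    if file_size == 0:
        return []
    count = math.ceil(file_size / cluster_size)
    return list(range(first_cluster, first_cluster + count))


CLUSTER_SIZE = 32 * 1024  # 32 KB clusters, as in the example above

print(assume_consecutive(4, 100_000, CLUSTER_SIZE))   # needs 4 clusters
print(assume_consecutive(10, 20_000, CLUSTER_SIZE))   # fits in one cluster
```

The second call shows why small files recover perfectly: a file no larger than one cluster needs no allocation information beyond its first cluster.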
As annoying as it is, although their content is still somewhere on the drive, these files are unrecoverable.
There is no automated data recovery software available that can solve fragmentation satisfactorily. If one wants to recombine a file consisting of 10 clusters on a 20 GB drive, one must analyze, given a cluster size of 32 KB, all possible combinations of one known cluster with 9 other clusters out of a possible 625000. There are 625000^9 possible combinations, a number with 53 digits.
The only possible and more intelligent approach is a "manual" data recovery for a particular file. One would begin at the known cluster and search onward, looking for data known to belong to the missing part of the file. Finally, all findings are put together into a new file. The limitations of this approach are obvious: it can only be done for a couple of files with known content.
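The size of this search space is easy to verify. The sketch below only reproduces the estimate in the text (9 unknown clusters, each chosen from all clusters on the drive, with decimal units for 20 GB and 32 KB); the variable names are illustrative.

```python
drive_bytes = 20_000_000_000   # 20 GB, decimal units as in the text
cluster_bytes = 32_000         # 32 KB

clusters_on_drive = drive_bytes // cluster_bytes
combinations = clusters_on_drive ** 9  # 9 unknown clusters follow the known one

print(clusters_on_drive)
print(f"{combinations:.3e}")
print(len(str(combinations)), "digits")
```

The count comes out at about 1.5 × 10^52, which is why brute-force reassembly is not an option.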
** Lost Files
If the directory entry was lost but the file content is still on the drive one might be able to recover the file if one knew where it is located. This problem is different from the fragmentation problem. One does not know the name, size and start of the file. This happens if the original directory entry was re-used by the operating system while the file content was left unchanged. If one formats a drive and puts 500 MB of a new Windows OS on the drive, the first 500 MB of the drive, including a lot of directory information, will be destroyed, although the files themselves might still sit in locations beyond the overwritten 500 MB.
Data Detect is capable of recovering many kinds of files whose directory entries were lost. We call them "lost files". While scanning the drive, we compare each sector to a list of user-defined file signatures. For example, a Word document always begins with the bytes d0-cf-11-e0. If no matching directory entry is found for a sector carrying such a signature, a "lost file" is created. This way lost DOC, JPG, BMP, ZIP and other files are recovered.
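A signature scan of this kind can be sketched as follows. This is an illustration, not Data Detect's implementation: the function name, the 512-byte sector size and the `known_starts` parameter are assumptions, and apart from the Word signature quoted above, the signatures listed are the commonly documented ones for those formats.

```python
SECTOR_SIZE = 512  # assumed sector size

# File signatures ("magic bytes") checked at the start of each sector.
SIGNATURES = {
    "DOC": bytes.fromhex("d0cf11e0"),  # quoted in the text above
    "JPG": bytes.fromhex("ffd8ff"),
    "ZIP": b"PK\x03\x04",
    "BMP": b"BM",
}


def find_lost_files(image, known_starts=()):
    """Return (sector, kind) for every sector that begins with a known
    signature and is not already claimed by a directory entry."""
    known = set(known_starts)
    hits = []
    for offset in range(0, len(image), SECTOR_SIZE):
        sector_number = offset // SECTOR_SIZE
        if sector_number in known:
            continue  # a directory entry already describes this file
        sector = image[offset:offset + SECTOR_SIZE]
        for kind, magic in SIGNATURES.items():
            if sector.startswith(magic):
                hits.append((sector_number, kind))
                break
    return hits


# Tiny four-sector test image with a DOC header and a JPG header.
image = bytearray(4 * SECTOR_SIZE)
image[512:516] = SIGNATURES["DOC"]
image[1536:1539] = SIGNATURES["JPG"]

print(find_lost_files(bytes(image)))
print(find_lost_files(bytes(image), known_starts=[1]))
```

A signature only reveals where a file starts and what kind it is. Name, size and any fragmentation remain unknown, so short signatures such as "BM" will also produce false hits.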
*** File's Allocation Was Overwritten
If the file's allocation - as in the four bottom cases in the recovery matrix - had been destroyed or overwritten by other data, there is no possibility at all to recover this file. Once overwritten, it is unfeasible to retrieve the information that was originally stored there. Theoretically one might be able to read the "rest magnetization" with an advanced technology such as MFM (Magnetic Force Microscope), but it is unknown if anybody can actually do this. Certainly, if this technology exists it is not "commercially available and affordable".
Logical Reconstruction for NTFS
As we will see, NTFS is the better file system when it comes to data recovery. Usually, there is NO problem with fragmentation.
Let's examine how a file in NTFS is recovered. A file in an NTFS file system is completely described by:
- Its MFT (Master File Table) entry
- The allocated clusters on the drive containing the content of the file
The MFT entry is picked up during the initial scan of the drive when examining each single sector. It contains the file's name, size, date, time and the clusters occupied by its data. Unlike the directory entry in a FAT file system, it contains the complete list of the used clusters, called the "run list".
The run points directly to the clusters that are allocated for the file. In this example, IMG_2379.JPG uses x1F5 (501) clusters beginning at cluster x3F02 (16130).
Information about a file in an NTFS file system is spread among two different locations. The MFT entry contains both the file's name and where on the drive it is allocated.
This information is used to reconstruct the files. Note that in NTFS, unlike in FAT, one does not have a fragmentation problem. As soon as there is an MFT entry, one knows exactly where the file is allocated. This yields better data recovery results for fragmented files.
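To illustrate why the run list removes the fragmentation problem, here is a sketch of a run list decoder in Python. It assumes the commonly documented NTFS encoding (a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the offset field, with offsets stored as signed values relative to the previous run). The function name and the sample bytes are made up for the example, and sparse or compressed runs are not handled.

```python
def decode_run_list(data):
    """Decode an NTFS run list into (start_cluster, cluster_count) pairs."""
    runs = []
    pos = 0
    previous_start = 0
    while pos < len(data) and data[pos] != 0:  # a zero byte ends the list
        header = data[pos]
        length_size = header & 0x0F   # bytes used for the cluster count
        offset_size = header >> 4     # bytes used for the start offset
        pos += 1
        length = int.from_bytes(data[pos:pos + length_size], "little")
        pos += length_size
        delta = int.from_bytes(data[pos:pos + offset_size], "little",
                               signed=True)
        pos += offset_size
        start = previous_start + delta  # offsets are relative to the last run
        runs.append((start, length))
        previous_start = start
    return runs


# One run: x1F5 (501) clusters beginning at cluster x3F02 (16130).
single = decode_run_list(bytes.fromhex("22f501023f00"))
print(single)

# A fragmented file: a second run of 10 clusters, 16 clusters before the first.
fragmented = decode_run_list(bytes.fromhex("22f501023f110af000"))
print(fragmented)
```

Every fragment is listed explicitly, so a surviving MFT entry is enough to collect all pieces of the file in the right order.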
Because information about a file is stored in two different spots it will obviously cause problems if any of these are missing or incomplete.
NTFS Recovery Matrix:
* Lost Files
If the MFT entry was lost but the file content is still on the drive, one might be able to recover the file if one knew where it is located.
The name, size and allocation of the file are not known. This happens if the original MFT entry was re-used by the operating system while the file content was left unchanged. For example, if one formatted a drive and put 500 MB of a new Windows OS on the drive, the first 500 MB of the drive, including many MFT entries, will be destroyed, although the files themselves might still sit in locations beyond the overwritten 500 MB.
Data Detect is capable of recovering many kinds of files whose MFT entries were lost. They are called "lost files". While scanning the drive, we compare each sector to a list of user-defined file signatures. For example, a Word document always begins with the bytes d0-cf-11-e0. If no matching MFT entry is found for a sector carrying such a signature, a "lost file" is created.
This way lost DOC, JPG, BMP, ZIP and other files are recovered.
** File's Allocation Was Overwritten
If the file's allocation - as in the two cases below - had been destroyed or overwritten by other data, there is no possibility at all to recover this file. Once overwritten, it is unfeasible to retrieve the information that was originally stored there. Theoretically one might be able to read the "rest magnetization" with an advanced technology such as MFM (Magnetic Force Microscope), but it is unknown if anybody can actually do this. Certainly, if this technology exists it is not "commercially available and affordable".
Data Recovery From an Image After a Physical Problem (bad sectors)
When Data Detect recovers data on an image obtained from a physically damaged drive, we will usually get good recovery results, assuming this image contains only "some" unrecoverable sectors.
Several factors contribute to this optimistic outlook:
- A drive with bad sectors is usually not altered too much by user attempts to "fix" the problem.
- If it is a FAT drive, the file allocation table and its copy are still there.
- Most file system structures are available.
Of course, success depends on your ability to obtain this image. Files that were allocated in the damaged portions will be damaged after the recovery.
Data Recovery After Deleting/Recreating a Partition
When a partition is deleted, only the partition table and the boot record are affected. Important structures, such as MFT and FAT are usually undamaged.
Even recreating the partition should not alter important data structures, as long as the volume is not formatted.
Data Recovery After Formatting
In FAT, formatting a volume clears both file allocation tables and deletes the root directory. All data is still there, but you have lost:
- All entries in the root directory. Files can only be recovered as "lost files". Sub directories of the first level will have only numbers instead of their original name. Sub directories of deeper levels show their original name.
- The file allocation tables. This will cause the "fragmentation problem"
Within the limitations above, one will get a "fair" data recovery. Most files should be uncorrupted. You will need to look for your files in the numbered directories. Fragmented files, such as Outlook email files or databases, will be corrupted and probably unusable.
In NTFS, formatting a volume creates a new MFT. However, this affects only the first 25 or so entries. It usually does not touch the MFT entries of previous user files.
That means you can expect a "good" data recovery. Almost all files should be correctly retrieved.
Your results will be even better if a drive that was previously FAT-formatted has been formatted with NTFS, or vice versa. In this case the original FAT or MFT will probably not be damaged because these structures are located in different areas of the drive.
Data Recovery After Installing a New Windows Operating System
Here's where the trouble really begins. Installing a new OS easily overwrites 2 GB.
All files that were once located in these 2 GB will be irrevocably lost. Also, directory entries (FAT) and MFT entries (NTFS) located there will be lost, leaving only files without reference ("lost files") that had been allocated in undamaged areas beyond the 2 GB.
In FAT this will also destroy the FATs, thus causing the fragmentation problem.
As explained before, a technology capable of recovering data from "rest magnetization" is not commercially available. All you possibly can recover will come from the not overwritten area.
Let's suppose you originally have a 20 GB FAT-formatted hard drive with 10 GB used for 50000 files in 2000 directories.
A new OS of 2 GB is installed on the drive.
- 10% of all data on the drive (2 of 20 GB) is lost.
- 20% of your data on the drive (2 of 10 GB) is lost.
- 20% of your files, those that were allocated in the overwritten area, are lost.
- Because most of the directory entries are located in the first 2 GB an additional 30% of your files are lost.
- An additional 10% of all files due to fragmentation and overlapping between the two areas may also be lost.
40% of your files can be recovered undamaged. 10% will be partly recoverable, and possibly half of the "lost files", 15%, will be recovered without file names.
If your files on the drive were "depending" on other files, e.g. tables for databases, this number drops even further:
- If the files depend on each other pair-wise, e.g. you have one Word document for "Contracts" and one for "Appendixes", only 0.4*0.4*100 = 16% of all pairs (projects) are left.
- If there are projects of 5 files each on the drive, only (0.4)^5*100 = 1% of these projects are recovered without damage.
Note that, depending on the kind of data, losing 10% of the raw data can cause the loss of 99% of the projects.
For a previously NTFS-formatted drive, the prospects are brighter:
- Fewer files are lost due to fragmentation and overlapping, only 5% instead of 10%.
- Fewer MFT entries are lost than directory entries would be in FAT, 10% instead of 30%. NTFS tends to spread the MFT across the drive.
- 65% of your files are recovered undamaged.
- 42% of your 2-file projects are recovered, almost three times more than with FAT.
- 12% of your 5-file projects are recovered, twelve times more than with FAT.
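The percentages in this example can be reproduced with a short calculation. The sketch below uses the loss figures from the scenario above and, like the text, assumes that each file of a project survives independently of the others. The dictionary and function names are illustrative.

```python
# Losses in percent of all user files, taken from the scenario above.
fat_losses = {"overwritten": 20, "directory entries": 30, "fragmentation": 10}
ntfs_losses = {"overwritten": 20, "MFT entries": 10, "fragmentation": 5}


def undamaged_share(losses):
    """Fraction of files recovered undamaged after all listed losses."""
    return (100 - sum(losses.values())) / 100


def project_survival(file_share, files_per_project):
    """Fraction of projects in which every single file survived."""
    return file_share ** files_per_project


for name, losses in (("FAT", fat_losses), ("NTFS", ntfs_losses)):
    share = undamaged_share(losses)
    print(f"{name}: {share * 100:.0f}% of files undamaged")
    for n in (2, 5):
        pct = project_survival(share, n) * 100
        print(f"  {n}-file projects intact: {pct:.0f}%")
```

Because the per-file survival rate is raised to the power of the project size, even a moderate difference between 40% and 65% grows into a large gap for bigger projects.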
Data Recovery After Imaging or "Ghosting" a Drive
The consequences of imaging over a drive, for example with Norton's Ghost, are similar to the ones faced after installing a new OS on it. If the image was quite large, chances of recovering a lot of files are pretty slim.
Data Recovery After Deleting Files
Although seemingly easy, this is tougher than recovering files from a bad sector drive or after an Fdisk or format.
File deletion is the least understood topic.
Ironically, what makes data recovery of deleted files so hard is the fact that the user can still work with his drive. Attempts to recover the just deleted files often ruin his chances.
Let's have a look at how the operating system deletes a file.
File Deletion In FAT
A single file gets deleted by:
- Marking the directory entry with E5
- Freeing the associated FAT entry
Whole directories are deleted by:
- Marking the directory's directory entry with E5. The files' directory entries inside are usually left unchanged.
- Freeing the FAT entries for both the directory and the files inside.
After deletion there is a possible fragmentation problem because the allocation information, which is stored in the FAT, is irrevocably lost.
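What a scanner can still read from a deleted directory entry can be sketched like this. The code assumes the commonly documented 32-byte short-name entry layout (name in bytes 0-10, high word of the first cluster at offset 20 on FAT32, low word at offset 26, file size at offset 28). The function name and sample values are invented for the example, and long-file-name entries are ignored.

```python
import struct

DELETED_MARKER = 0xE5  # first byte of a deleted entry, as described above


def parse_dir_entry(entry):
    """Decode one 32-byte short-name entry; return None if never used."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is 32 bytes long")
    if entry[0] == 0x00:
        return None
    deleted = entry[0] == DELETED_MARKER
    # The first character of the name is overwritten by the marker.
    raw_name = (b"?" + bytes(entry[1:8])) if deleted else bytes(entry[0:8])
    (cluster_high,) = struct.unpack_from("<H", entry, 20)
    cluster_low, size = struct.unpack_from("<HI", entry, 26)
    return {
        "name": raw_name.decode("ascii", "replace").rstrip(),
        "extension": bytes(entry[8:11]).decode("ascii", "replace").rstrip(),
        "first_cluster": (cluster_high << 16) | cluster_low,
        "size": size,
        "deleted": deleted,
    }


# Build a sample entry (size chosen arbitrarily), then "delete" it.
entry = bytearray(32)
entry[0:11] = b"IMG_2379JPG"
struct.pack_into("<H", entry, 26, 4)
struct.pack_into("<I", entry, 28, 1_500_000)
live = parse_dir_entry(entry)

entry[0] = DELETED_MARKER
deleted = parse_dir_entry(entry)
print(deleted)
```

Size and first cluster survive the deletion; only the first character of the name and the chain in the FAT are gone, which is where the fragmentation problem comes from.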
File Deletion in NTFS
A file gets deleted by flagging its MFT entry as unused. The MFT entry still contains the file's allocation.
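Checking this flag can be sketched in a few lines. The example assumes the commonly documented MFT record header (signature "FILE" at offset 0 and a 16-bit flags field at offset 0x16, where bit 0 means "in use" and bit 1 means "directory"). The function name and the 1024-byte sample record are assumptions for illustration only.

```python
import struct

FLAG_IN_USE = 0x0001
FLAG_DIRECTORY = 0x0002


def mft_record_state(record):
    """Return the in-use and directory flags of an MFT record, or None
    if the buffer does not start with the FILE signature."""
    if bytes(record[0:4]) != b"FILE":
        return None
    (flags,) = struct.unpack_from("<H", record, 0x16)
    return {
        "in_use": bool(flags & FLAG_IN_USE),
        "is_directory": bool(flags & FLAG_DIRECTORY),
    }


# Sample record: first an existing file, then the same record after deletion.
record = bytearray(1024)
record[0:4] = b"FILE"
struct.pack_into("<H", record, 0x16, FLAG_IN_USE)
before = mft_record_state(record)

struct.pack_into("<H", record, 0x16, 0)  # deletion clears the in-use bit
after = mft_record_state(record)
print(before, after)
```

Nothing else in the record has to change on deletion, so name and run list stay readable until the record is re-used for a new file.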
The Recycle Bin
What is described above applies to the permanent deletion of files. If files are not deleted permanently they are moved to the "Recycle Bin" and can be recovered from there.
While moving the files to the Recycle Bin they get renamed to numbers while keeping their extension. For example, "My vacation.DOC" will get a new name like "D24.DOC" in the Recycle Bin folder. This does not matter as long as these files are still in the Recycle Bin. The OS will provide you with the correct name when you choose to undelete these files.
If the Recycle Bin is emptied, the deletion processes described above are carried out for these renamed files. To recover "My vacation.DOC", you then have to look for a file with an unknown numbered name and the extension "DOC".
Chances for successfully recovering deleted files
The locations of the deleted files are no longer protected by the file system. Those locations might be recycled the next time the OS creates a new file. That is why it is such a problem if the user continues to work with the affected hard drive.
Files are created all the time. Processes write log files. Therefore, even booting and running Windows from the affected drive can overwrite the critical areas. Browsing a single Website downloads multiple pictures to the local hard drive.
To protect those deleted files, the user must stop working with the drive immediately and connect it to another computer as a second (slave) drive. We've tested how long a deleted file was recoverable before the OS recycled the deleted file's directory entry, MFT or allocation. It happened almost instantly. This leads us to be very pessimistic about the prospects of recovering "a couple" of deleted files.
If, on the other hand, you deleted a larger amount, let's say 1 GB consisting of 1000 files, and do not continue to work with this drive, the chances of recovering most of these files are pretty good. If you work with FAT, you will possibly face the fragmentation problem.
Recovering Data In Time
The time dimension often gets underestimated when it comes to data recovery. Losing data for a week can be as bad as losing the data forever.
Data recovery is a delicate and complex business, but two things are certain: the first recovery attempt is the best opportunity for success, and the recovery company you choose can greatly impact the outcome. Data Detect's reputation as one of the most trusted and respected companies in Australia is founded upon our commitment to provide secure, fast, accurate data recovery and outstanding customer service.