File Organization and Back Up

I've been asked about this and have been talking about it a lot lately, so I thought I'd put it down in writing for easy learning and reference. I don't know why the "Intro To Computer" classes don't teach this. If you're an instructor and do, pat yourself on the back... if not, you need to be giving out refunds. To even start using a computer effectively, organization and back up need to be taught. The concepts are quite easy if fully explained.

This paper deals with file system organization of private data and methods of backing that data up. This document is not geared towards operating system level imaging and back up methods. This document is geared more towards the common user and what that person should do (as power users or administrators should already know all of this). I realize there are multiple ways of doing the same thing, but I'm going to stick to simpler methods for this document.

You cannot easily do back ups without some kind of organization to begin with. It's just not practical. This is where I'll start.

The primary rule is to keep your data together. Under Linux, this is pretty easy given how Unix home folders tend to work. I don't know about Mac. Under Micro$oft, this is problematic... so windoze will get picked on for being the bad example.

Windoze programs have a history of making a mess of the file system. It has been getting better over the years, but it still isn't where it should be (thank you Micro$oft...you industry giant and lazy multi-billion dollar company). Historically windoze programs have saved their data in the program folder and sometimes in another data folder (often a folder off C:\). With the "invention" of the "My Documents" folder, this has progressively been getting better, but there are still some programs that insist on the old scattered structure.

Why is this bad and a poor programming practice? Mainly because finding important files for back up and in a time of need will be difficult and tedious. Considering how unmotivated so many end users are, this means that important data may never get backed up. Even if the end user is motivated, that person may have to back up many such locations and forget one. So far I'm not seeing anything positive about this model.

What can an end user do about scattered file saving? Simple, make sure you save your data in a consolidated location each and every time. With some of those older and poorly made programs, this could be obnoxious. You'll just have to get over it and do it anyways. If you don't like how they work, complain to their programmers and get them to make fixes they should have done long ago. Either that or find a better program to use.

The second rule is to keep your data organized. Consolidation alone isn't enough. If you always dump your digital camera's pictures into the same folder and you have hundreds of files with suffixes like "Copy 1", "Copy 2", "Copy 3", and so on, how are you going to really know what is what?

Most people find the organization part to be the hardest. It's not. It is nothing more than a bunch of nested folders going from most general at the top of the tree to most specific at the bottom. Each folder is nothing more than a category name. Each subfolder is nothing more than a subdivision of the main category. Keep going with that until the actual files have a good place to be. This is an extremely important concept that needs to be studied until it is fully understood. Someone who does not understand this will always have problems here.

For common users, I usually endorse the use of the "Desktop" as the starting point. It's always there and easy to find. I do NOT endorse the use of a cluttered desktop where the wallpaper is literally fully obscured by icons. That's not organization.

From there, start creating folders and naming them as the top level categories you want. Popular choices are shown in the tree:

Desktop
\_Documents
\_Pictures
\_Music
\_Videos
\_Downloads
\_Archive

It's pretty obvious what goes where in all of these. What is not that obvious to the end user is how each top folder should be split into subfolders. Questions must be asked about what kind of data and how much data must be stored in each.
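If clicking "New Folder" six times sounds tedious, a few lines of Python can build the starting tree. This is only a sketch: the function name create_skeleton is made up for this example, and the commented-out usage line assumes the Desktop is a folder named "Desktop" inside the home folder, which is not true on every system.

```python
from pathlib import Path

# Top-level categories from the tree above.
TOP_LEVEL = ["Documents", "Pictures", "Music", "Videos", "Downloads", "Archive"]


def create_skeleton(base, names=TOP_LEVEL):
    """Create one folder per category under base and return the folder paths."""
    base = Path(base)
    folders = []
    for name in names:
        folder = base / name
        # exist_ok=True leaves existing folders alone, so a second run is harmless.
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders


# Example (assumption: the Desktop lives at <home>/Desktop on this machine):
# create_skeleton(Path.home() / "Desktop")
```

Nothing is deleted or overwritten by this, so it is safe to try in an empty test folder first.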

The easiest way is to illustrate with a series of examples.

Example. If a family of 4 (Father, Mother, Brother, Sister) is going to be sharing that computer (ignoring security and privacy concerns), then there are a few immediate ways to categorize things. Each person could have a folder with their name under each top-level folder listed above. This would keep each data type sorted by that primary category and subsorted by the person. On the flip side, each person could have a folder by their own name and have the category list above as subfolders (kind of like each person has their own room in the house, and each room has a closet for each folder category). If Father and Mother work in a business together, they may have a joint folder that is shared (kind of like a master bedroom). It might be called something like "Business" or "Office". If the whole family has shared data, it may not need to be in another subfolder. Each method has advantages and disadvantages. Neither is fully right or wrong. It is how the people want to use the computer that will define it. The key is to stay consistent once the structure has been defined (keep your rooms and closets clean in the house).

Example. Let's say Father has a digital camera and likes to take a lot of family pictures in addition to using it for work. "Family" and "Work" need to be split apart, so they each get different subfolders under "Pictures". "Family" then again would get more subfolders based on the content of the events. The same holds true for "Work". For folder labeling, I recommend a date, the location, and who was there. It's also good to include a text document in the folder with the pictures describing the event in more detail (kind of like a diary).

Pictures
\_Family
   \_DATE_SisterAtSchoolPlay
   \_DATE_BrotherAtSportingEvent
   \_DATE_MotherAtFlowerGarden
   \_DATE_BrothersScienceProject
   \_DATE_FamilyHikingInMountains
\_Work
   \_DATE_StormDamage
   \_DATE_BigConvention
   \_DATE_BigConvention-Seminar
   \_DATE_BigConvention-Closing
   \_DATE_Inventory

For naming, some people prefer "DATE" first, some last. I usually prefer it first. Just be consistent and make sure it is easily searchable for the category chosen. When "DATE" is first, start with the largest unit of time and work down to the smallest (Year-Month-Day). This will naturally sort the folders in chronological order. Example: "2009-01-31" would be "January 31, 2009". Leading zeros are important in front of single-digit months and days. Computers sort folder names as plain text, so without the zero "2009-10-05" would land ahead of "2009-2-05" and the chronological order turns into a mess.
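To see why the leading zeros matter, here is a small Python sketch. The helper folder_name is a made-up name for this example, and the dates are invented purely for illustration; the labels are borrowed from the "Work" tree above.

```python
from datetime import date


def folder_name(day, label):
    """Build a name like '2009-01-31_FamilyPicnic' (Year-Month-Day, zero padded)."""
    # isoformat() always pads the month and day to two digits.
    return f"{day.isoformat()}_{label}"


# Same three folders, named with and without leading zeros.
padded = [
    folder_name(date(2009, 10, 5), "BigConvention"),
    folder_name(date(2009, 2, 14), "StormDamage"),
    folder_name(date(2008, 12, 25), "Inventory"),
]
unpadded = ["2009-10-5_BigConvention", "2009-2-14_StormDamage", "2008-12-25_Inventory"]

# A plain text sort, which is how folder names get ordered here.
print(sorted(padded))
print(sorted(unpadded))
```

The padded list comes out in true date order (December 2008, February 2009, October 2009). The unpadded list puts October 2009 ahead of February 2009, because the text "1" sorts before the text "2".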

Example. Let's say Mother has a video camera with the training to use it and likes to make home movies of the family. This would have a structure similar to "Pictures" but would use "Raw" and "Edited" when relating to video that has been dumped from the camera and video that has been processed into a finished form.

Videos
\_HomeMovies
   \_Raw
      \_DATE_FamilyPicnic
      \_DATE_KidsAtSchoolPlay
   \_Edited
      \_DATE_KidsAtCarnival
      \_DATE_KidsAtSportingEvent

Example. Let's say Sister has a lot of music and it's not something the rest of the family really likes. Using the Desktop tree structure example, she should probably have her own subfolder. Like everything else, music is best sorted from most generic to most specific. This means putting the music files under the "Artist" folder and into the "Album" subfolder. There could still be "Artist" folders under "Music" that the entire family likes listening to.

Desktop
\_Music
   \_Sister
      \_Artist1
         \_Album1
         \_Album2
      \_Artist2
         \_Album1
      \_Artist3
         \_Album1
         \_Album2
         \_Album3

Example. Let's say Brother is joining his parents in the family business and is doing a lot of the paperwork now. Under "Documents" a "Business" subfolder is created. This "Business" folder essentially becomes a hierarchical filing cabinet. Each folder on the computer mimics a drawer or folder in the real world filing cabinet.

Desktop
\_Documents
   \_Brother-SchoolWork
   \_Sister-SchoolWork
   \_Mother-Letters
   \_Business
      \_Advertising
      \_Invoices
         \_Completed
         \_PastDue
      \_Inventory
      \_Manuals
      \_FAX
         \_In
         \_Out
      \_Planning
      \_Resumes
      \_Taxes
         \_Sales
         \_State
         \_Federal
      \_Training

Example: Downloads Folder. This folder has a loose structure and is more often used for temporary storage during file transfers. Having a subfolder called "Installs" would be good for keeping downloaded programs that the computer runs. Having a subfolder called "System" would be good for keeping hardware drivers, manuals, and updates needed to make the computer run. In the house analogy, think of it like a tool shed.

Example: Archive Folder. This folder is more for long term storage for things needed to be kept but not used every day. It helps cut down the clutter from the main "Desktop" folder structure. "Archive" should have a folder structure very similar to "Desktop". Think of "Archive" as a storage shed, attic, or garage storage in the house analogy. Things in "Archive" may make very good candidates to be moved off onto permanent optical disc storage. This would lessen the back up load somewhat.

Note. With use, folder structures may grow too big or have too many files in them. It's ok to reorganize. It isn't a big deal. Think of it as spring cleaning in the house analogy. The key is to keep the structure clean, easy to find, and easy to use.


Backing Up. If you can master consolidation and organization, backing up suddenly becomes very trivial. All of the data is in one place and already sorted. All that needs to be done is the copying off process.

Back Up Methods: Software. There are 2 types of back up methods: incremental and complete. The incremental method starts with a complete back up and then only backs up what has changed in each following back up. If little changes, this has the advantage of size and speed. The problems are that a special program needs to keep track of what has changed, and the original complete back up plus all of the incremental back ups must be kept for a successful restore. I do not recommend this method for casual home use. Complete back ups copy everything every time a back up is performed. This has the advantage of simplicity for the end user and also keeps the back ups self contained (no increments needed to restore). If compression is desired, programs that support ZIP or RAR could be used. Both formats can be opened on nearly every platform, though ZIP is the safer choice since RAR itself is a proprietary format and creating RAR files requires the vendor's software. Compression is not useful with most videos, music, or pictures as these are usually already compressed. Stay away from any program that uses custom or proprietary formats and methods. These could really bite you in a time of need.
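For anyone who would rather script a complete back up than drag and drop, here is a minimal Python sketch that zips one whole folder into a single self-contained file. The function name complete_backup and the "Name_Year-Month-Day" file name are my own choices for this example, not a standard. It also assumes the destination folder sits outside the folder being backed up.

```python
import shutil
from datetime import date
from pathlib import Path


def complete_backup(source, dest_dir, when=None):
    """Zip the whole source folder into dest_dir and return the archive path.

    Assumption: dest_dir is NOT inside source, otherwise the archive
    would try to include itself.
    """
    source = Path(source).resolve()
    dest_dir = Path(dest_dir).resolve()
    dest_dir.mkdir(parents=True, exist_ok=True)
    when = when or date.today()
    # make_archive appends the ".zip" extension on its own.
    base_name = dest_dir / f"{source.name}_{when.isoformat()}"
    archive = shutil.make_archive(
        str(base_name),
        "zip",
        root_dir=str(source.parent),  # paths in the archive start from here
        base_dir=source.name,         # so everything sits under "Documents/..."
    )
    return Path(archive)


# Example (hypothetical paths):
# complete_backup(Path.home() / "Desktop" / "Documents", "/media/usbstick/Backups")
```

Every run produces a full copy, so any single ZIP file is enough for a restore.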

Back Up Methods: Hardware. There are many. Some are better than others. One good method may not be so good for something else depending on the use. Some methods are a "fad of the year" and should be avoided (none listed here). For generic consumer use, I generally recommend backing up to USB sticks, to optical discs, to the local hard disk, and to a remote computer. Any or all of these options could be employed depending on the desired results.

Backing Up To USB Sticks. Usage. Plug the stick in, drag and drop the folders into the stick's folder, unmount the stick, and unplug the stick. Delete old files before the copy if needed. It's hard to get much easier than that. Thoughts. These are relatively cheap and easy to find. USB is a standard and will be around for a long time. Don't lose the smaller sticks. Multiple sticks may be needed for a large amount of data. Good for data that changes often. If used heavily, the flash chip inside the USB stick will wear out and die. Since the USB controller and flash chip are electronics, they are vulnerable to static charge and surge damage. If the chips inside the stick are damaged, there is no hope for recovery.

Backing Up To Optical Discs. These include CD-ROM, DVD, HD DVD, and Blu-ray discs. Usage. A special "disc creator" or "burning program" must be used to make an optical disc. In the old days these were a bit of a pain. With newer "drag and drop" versions, these are relatively easy to use. Thoughts. Burning an optical disc is an extra step compared to the other methods. Most discs have a good burn time of 30 to 60 minutes. A slower burn speed will generally produce a better optical disc than a faster one. Multiple discs may be needed depending on back up size. Not overly great for data that changes often. Very good for permanent storage. Most retail store brand optical discs are junk and not worth the coasters they become. A good brand of optical disc (like Taiyo Yuden) is rated to last 100 years and will do so as long as the disc is maintained and isn't damaged. Keeping optical discs from being scratched can sometimes be a pain. Optical discs have no electronics and can survive static shock, electrical surge, and an EMP. Older optical drives cannot read the newer formats. Newer optical drives can read almost anything. "RW" class discs can be re-used. This is my preferred method for long term storage.

Backing Up To The Local Hard Disk. Usage. Create a new back up folder and just copy the files into it. Thoughts. This method is the fastest. This method often offers the most space. Make sure there is enough space available to hold the back up and still be able to use the hard disk. This method is only useful for keeping previous versions of the data being backed up and not useful for protecting against hardware failures. This method is good for protecting against an accidental deletion of something important. Hard disks have a relatively short life span. If the hard disk starts to fail, it may be possible to still copy off the data, but only with some real effort (be wary of corruption). If the electronics in the hard disk fail, it is likely that all data would be lost, but it may be shipped back to a recovery company to see if the platters are still readable (this is expensive). If a hard disk truly fails then the original data and back ups are lost at the same time (total disaster). Hard disks are very static and surge sensitive. Hard disks are very impact sensitive. Older hard disk controllers are phased out of newer hardware making this method unsuitable for very long term storage. A hard disk is often difficult to take from one computer to another (even with drive trays).
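The "make sure there is enough space" step is easy to skip when copying by hand, so here is a rough Python sketch that checks first and copies second. The names folder_size and copy_backup are invented for this example, the dated target folder name is just one reasonable choice, and the 1 GB reserve is an assumed safety margin that should be adjusted to taste.

```python
import shutil
from datetime import date
from pathlib import Path


def folder_size(folder):
    """Total size in bytes of every file under folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def copy_backup(source, backup_root, when=None, reserve_bytes=1_000_000_000):
    """Copy source into a new dated folder under backup_root and return its path.

    reserve_bytes is an assumed margin (about 1 GB) left free so the
    hard disk is still usable after the copy.
    """
    source = Path(source)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)
    when = when or date.today()
    target = backup_root / f"{source.name}_{when.isoformat()}"

    needed = folder_size(source) + reserve_bytes
    free = shutil.disk_usage(backup_root).free
    if free < needed:
        raise OSError(f"Not enough space: need {needed} bytes, only {free} free")

    # copytree refuses to write into an existing folder, so an earlier
    # back up with the same date cannot be clobbered by accident.
    shutil.copytree(source, target)
    return target
```

Running it twice on the same day raises FileExistsError rather than silently overwriting the earlier copy.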

Backing Up To A Remote Computer. Usage. This is often done over a network share (SMB, CIFS, or NFS). The network share is mounted, a back up directory is created, and the files are copied over. Thoughts. This offers a lot of flexibility if multiple computers are on the same network and need to be backed up. If proper network security is not set up, other computers could read or delete back ups that are not their own. If a lot of data is being transferred, then a fast network should be used. This method shares nearly all the same benefits and risks as a local hard disk back up except the danger of all the data being on one disk and lost at the same time. If the back up disk is not in a drive tray, it will be difficult and bulky to transport around. This is my preferred method for data that changes often.

Back Up Naming and Placement. Clarity is important for both identification and ease of restoration. I usually recommend the name to be of what was backed up and a date stamp to know when. This also helps keep back ups from getting mixed up. If many old back ups are not needed, the date stamp makes choosing the old ones for deletion very easy. For disaster and recovery planning, keep back ups far enough away from the active data (mainly in hard disk back ups) that they won't get confused with the data that truly needs to be salvaged.
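As a small illustration of the "what plus date stamp" naming, the Python sketch below builds such a name and reads it back apart. Both function names are made up for this example, and the underscore separator is just one convention. The parser assumes the name carries no file extension such as ".zip".

```python
from datetime import date


def backup_name(what, when):
    """Combine what was backed up with a Year-Month-Day date stamp."""
    return f"{what}_{when.isoformat()}"


def parse_backup_name(name):
    """Split 'Documents_2009-01-31' back into its name and date."""
    # rpartition splits at the LAST underscore, so names that contain
    # underscores themselves (like 'Family_Pictures') still work.
    what, _, stamp = name.rpartition("_")
    return what, date.fromisoformat(stamp)


name = backup_name("Documents", date(2009, 1, 31))
print(name)
print(parse_backup_name(name))
```

A name that does not end in a valid date stamp makes date.fromisoformat raise ValueError, which is a handy way to spot a back up that was named by hand and doesn't follow the pattern.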

Back Up Frequency. This is highly dependent on the user's needs. For important business use, this could be hourly or daily. For generic use, this could be weekly. For generic home use, this could be monthly. If there is a danger of hardware failure, this could be every time something changes. These could also be mixed if some data needs backing up more often than other data.

Deleting/Removing Old Back Ups. This is highly dependent on the user's needs. Obviously if a back up is too old to be useful, delete it, erase it, or destroy it (depending on the method used). Try to keep back ups that might still be useful. In a double disaster situation where the last back up got corrupted or had problems, the previous back up may save your skin. If there isn't enough space for a new back up, obviously the oldest back up will need to be deleted.
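Picking which old back ups to remove can be sketched in a few lines of Python. The function name backups_to_delete and the default of keeping 3 are my own assumptions for this example. It only works when every name is the same thing backed up on different days with a Year-Month-Day stamp, and it only reports names; the actual deleting is left to a human.

```python
def backups_to_delete(names, keep=3):
    """Return the oldest back up names, leaving the newest `keep` alone."""
    if keep < 0:
        raise ValueError("keep must be zero or more")
    # Year-Month-Day stamps sort chronologically as plain text.
    ordered = sorted(names)
    if keep >= len(ordered):
        return []  # nothing is old enough to remove yet
    return ordered[: len(ordered) - keep]


# Invented names for illustration.
existing = [
    "Documents_2009-03-01",
    "Documents_2009-01-31",
    "Documents_2009-02-28",
    "Documents_2008-12-31",
]
print(backups_to_delete(existing, keep=2))
```

With keep=2 this lists the December 2008 and January 2009 back ups for removal. Keeping at least two means the previous back up is still around if the newest one turns out to be corrupted.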

Back Up Dangers and Pitfalls. If the backed up data is very important, make multiple copies and keep those in multiple locations. For an important back up in a single location, get a watertight fire safe and bolt it to the floor. Electronics are not waterproof and will be fully destroyed if submerged in a flood. Electronics are lightning/surge sensitive and can be easily destroyed in an electrical storm. A nearby lightning strike could destroy all electronics-based back ups at once if they are plugged in. Optical discs have some water resistance but can be easily scratched in muddy water. Tornadoes and hurricanes may take the back ups to some place very far away and/or crush them (use a strong bolted down safe). Even though the physical security on a computer may be high, the physical security on back ups is often ignored, making them an easy target for theft. Throwing away old back ups without destroying them makes them prime candidates for dumpster diving theft. If security on the back ups is weak, they may be easily copied and nobody would know. Encrypting back ups is a possibility but is often beyond the realm of the common user (need strong keys that few people know about).