Step 2: Filenaming tips

Introductory video: Filenaming

Developing good filenaming practices for transparency and ease of management

Many people do not put much thought into the names of their digital files, since most of the software programs that they regularly use do this task for them. However, if you want your data to be useful to yourself and other people for many years to come and across a variety of different types and versions of software (including the software used in digital repositories), then you need to put some thought into how you name your files. By putting a little planning and effort into the process early in your research, you will save yourself, your collaborators, and the future users of your data a great deal of time, effort, and frustration.

It is important that the digital files in your collection have names that will not interfere with archival processing, are unique within your materials, and are named in consistent ways to make organizing and finding materials in your collection as easy as possible. Creating and sticking to a filenaming system requires some discipline since some computer operating systems and software give users more latitude in how files can be named than others. Some common cloud-based document applications like Google’s GSuite will tolerate untitled documents or assign a title based on the first line of text in the document. Some systems might have no limit on the number or types of characters used in filenames while other systems allow only ASCII characters in limited numbers. Often people create ad hoc filenames when sharing files, and lots of devices such as cameras and audio recorders create names for the files they make.

Filenames should be unique so that no two files have the same name. This both helps users know what the content of a file should be and helps avoid problems in file transfers or preservation processes. In general, operating systems require that each file in a directory (i.e., a folder) have a different name, and while files with the same name may be in different folders in your collection, some behind-the-scenes processes or preservation actions might put all the files into a single directory.
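One way to check a collection for repeated names before depositing it is sketched below. This is a minimal illustration rather than a required tool: the function name `find_duplicate_names` is hypothetical, and comparing names case-insensitively is an assumption made here because some file systems do not distinguish upper and lower case.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_names(root):
    """Return {filename: [paths]} for every filename used more than once under root."""
    seen = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Assumption: compare names case-insensitively, because some file
            # systems treat "Notes.txt" and "notes.txt" as the same name.
            seen[path.name.lower()].append(path)
    # Keep only the names that appear in more than one place.
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

Calling `find_duplicate_names("my_collection")` on a folder of your own would list every name that appears in more than one subfolder, which are the files most likely to collide if everything is later placed in a single directory.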

To make your filenames unique, you can use sequential ordering, semantic naming, or a combination of the two.

In sequential order filenaming (pictured in Figure 15), files get sequentially numbered in the order that they are created. To avoid having files with identical names, cameras and recording devices often automatically make files with numbers that increment or that incorporate a time and date stamp into each filename. For example, a video camera might make the files 0001.mp4, 0002.mp4, 0003.mp4, etc.; the photograph with the filename 20191203_111325.jpg was taken at 11:13 AM on December 3, 2019. Automatic sequential filenaming is an easy option that also serves to preserve the original order in which the files were created. However, if your team has multiple people using the same kind of equipment, be aware that you might end up with duplicate filenames.
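To illustrate how a date-and-time stamp can be read back out of a filename, here is a small sketch. It assumes the pattern seen in the example above (year, month, day, an underscore, then hours, minutes, seconds); other devices may use different patterns, and the function name `parse_timestamp_name` is hypothetical.

```python
from datetime import datetime
from pathlib import Path


def parse_timestamp_name(filename):
    """Read the creation time from a name shaped like YYYYMMDD_HHMMSS.ext."""
    stem = Path(filename).stem  # drop the extension, e.g. ".jpg"
    # Assumed pattern: 8 date digits, an underscore, 6 time digits.
    return datetime.strptime(stem, "%Y%m%d_%H%M%S")


taken = parse_timestamp_name("20191203_111325.jpg")
print(taken.isoformat())  # prints 2019-12-03T11:13:25
```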

Figure 15:

Cartoon of a woman filming a computer screen, on the screen there are images in a folder which are named according to the order in which they were taken.

Semantic filenames encode some of the descriptive information relevant to your recording context. Language documenters are often trained to use at least three relevant descriptors. Examples include a code for the language or speaker, the date that the recording or file was made, and a short description of the content. Other relevant semantic information might include a code for the researcher or the transcriber, the location of the recording, or a version or part number. What information to include will of course depend on what kinds of information are most important to your project or collection; see Figures 16 and 17 below for examples.

Figure 16:

Graphic that shows a bad filenaming system, wherein the filename is dependent on the folder structure for identification and is too broad a name.

Figure 17:

Graphic that shows a good filenaming system, wherein the filename is unique and has a lot of identifying info.
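A semantic filename built from three descriptors (a language code, the recording date, and a short description) could be assembled as in the sketch below. Everything specific here is an assumption for illustration: the function name `semantic_filename`, the placeholder language code `xyz`, the choice of underscores between descriptors, and the hyphenated lowercase description. Your project or archive may prescribe a different scheme.

```python
import re
from datetime import date


def semantic_filename(language_code, recorded_on, description, extension):
    """Join a language code, a date, and a short description into one filename."""
    # Keep only ASCII letters and digits from the description and join the
    # resulting words with hyphens (an assumed convention, not a requirement).
    words = re.findall(r"[A-Za-z0-9]+", description)
    slug = "-".join(word.lower() for word in words)
    # Format the date as YYYYMMDD so that names sort chronologically.
    return f"{language_code}_{recorded_on:%Y%m%d}_{slug}.{extension}"


print(semantic_filename("xyz", date(2019, 12, 3), "Weaving story", "wav"))
# prints xyz_20191203_weaving-story.wav
```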

Since in most cases filenames cannot be changed once they are archived, you should be careful about including too much personally identifying information in the names of files themselves. Names of files are usually publicly visible even if their contents are restricted, so including a person’s full name, or distinctive parts of their name, in a filename may cause difficulties if they decide that they do not want to be identified with their contributions to the collection in the future.

Sequential and semantic methods can be combined to create sequentially ordered filenames that also contain semantic information. For example, if you are making recordings in two locations, your filenames could use a different code for each location, followed by sequential numbers. Files recorded in New York City would then be named NYC001.mov, NYC002.mov, etc., and files recorded in Newark, New Jersey would be named NNJ001.mov, NNJ002.mov, etc.
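The combined scheme can be sketched in a few lines. This is only an illustration; the function name `sequential_names` and its default three-digit padding are assumptions chosen to match the NYC001.mov example.

```python
def sequential_names(location_code, count, extension="mov", width=3):
    """Build names such as NYC001.mov: a location code plus a zero-padded number."""
    # Zero-padding keeps the files in the right order when sorted by name.
    return [
        f"{location_code}{number:0{width}d}.{extension}"
        for number in range(1, count + 1)
    ]


print(sequential_names("NYC", 2))  # prints ['NYC001.mov', 'NYC002.mov']
print(sequential_names("NNJ", 2))  # prints ['NNJ001.mov', 'NNJ002.mov']
```

Padding the numbers to a fixed width matters because, without it, a file numbered 10 would sort before a file numbered 2.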

Below are some important tips to consider when naming your files. Keep in mind that some of these tips may not seem relevant to you or your workflow, but they are crucial to the workflows of the digital archive that will be preserving your files.

Figure 18:

Infographic that summarizes the 7 filenaming tips that will be covered on the next page
