Managing Your Files | Saint Paul University

Creating a strategy for how you will manage your project files throughout the research process is a fundamental element of your overall data management plan. A research project may include multiple files in a variety of formats, multiple versions of files, spreadsheets, images, lab notes, interview tapes, etc., that are essential to the project. Establishing good file management practices at the outset is much easier than trying to organize the work midway through the project.

Managing your project files well pays off later by:

Increasing efficiency
Reducing risk of loss or file redundancy
Increasing research impact by making it easier to share files
Complying with legal/ethical requirements or policies
Providing a clear record of the research process
Facilitating preservation at the conclusion of the project

Elements central to managing project files include:

Adopting and documenting folder and file naming conventions
Creating a clear hierarchy of folders
Documenting file contents
Tracking file versions
Understanding file formats used in long-term preservation

Directory Structure

When organizing your files, consider including elements such as the project title, a unique identifier, and the date (year) in the name of the top-level project folder. The substructure should follow a clear, documented naming convention; for instance, one folder for each component or run of an experiment, each version of a dataset, and/or each person in the group. The structure should follow a consistent pattern that is clearly recognizable to the entire research group.

Elements of file naming conventions (below) apply to directory folder names as well.
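
As an illustration of the guidance above, the following is a minimal Python sketch that builds a top-level folder name from a project title, a unique identifier, and a year, then creates a fixed set of subfolders beneath it. The subfolder names and the function names are assumptions made for this example, not a prescribed layout; substitute whatever structure your group has documented.

```python
from pathlib import Path

# Hypothetical subfolder layout; replace with your group's documented convention.
SUBFOLDERS = ["data/raw", "data/processed", "docs", "notes", "outputs"]


def project_folder_name(title: str, identifier: str, year: int) -> str:
    """Combine project title, unique identifier, and year into one folder name."""
    # Remove spaces by capitalizing the first letter of each word.
    compact_title = "".join(word[:1].upper() + word[1:] for word in title.split())
    return f"{compact_title}_{identifier}_{year}"


def create_project_tree(root: Path, title: str, identifier: str, year: int) -> Path:
    """Create the project folder and its subfolders under root; return its path."""
    project_dir = root / project_folder_name(title, identifier, year)
    for sub in SUBFOLDERS:
        # exist_ok=True makes the call safe to repeat on an existing tree.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    return project_dir
```

For example, project_folder_name("soil survey", "P042", 2024) returns "SoilSurvey_P042_2024". Keeping the layout in one list makes it easy to record the same list in the project documentation.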

File Naming Conventions

Project files should be named and organized in a consistent and descriptive manner, in a way that is logical and predictable to yourself and others. Clear distinctions between files will facilitate effective and efficient file browsing and retrieval.

There are three things to keep in mind when labelling data:

Organization: Important for future access and retrieval.
Context: This could include content-specific or descriptive information that remains meaningful regardless of where the data is stored.
Consistency: Choose a naming convention and ensure that the rules are followed systematically by always including the same information (such as date and time) in the same order and in the same format (e.g., YYYYMMDD).

Consider using several of the following elements in file names:

Project name, number, or acronym
Creator surname and initials
Name of research team/department associated with the data
File version number
Date of creation
Date experiment undertaken
Description of content
Publication date

Other considerations:

Keep file names to a manageable length, preferably 25 characters or fewer
Do not give files the same name as the folder in which they reside
Avoid using unusual characters, such as: ! – @ # $ % ^ & * ( ) [] {}+ ? > <
Avoid using spaces. In place of spaces between words use one of the following methods:
Use a capital for the first letter of each word:
ProjectAcronymLastNameFirstNameTopic.txt
ProjectAcronymTopicOfDocumentDate.pdf
ProjetAcronymeSujetDuDocumentDate.pdf
Use an underscore between each word:
ProjectAcronym_last_name_first_name_topic.txt
ProjectAcronym_topic_of_document_date.pdf
AcronymeDeProjet_sujet_du_document_date.pdf
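
The considerations above can be sketched in Python as a small name builder and checker. This is only an illustration: the order of elements (project, description, date, version), the set of permitted characters, and all function names are assumptions chosen for the example, and the 25-character limit is taken from the guidance above.

```python
import re
from datetime import date

MAX_LENGTH = 25  # preferred upper bound from the guidance above
# Assumption: only letters, digits, underscores, and periods count as "usual".
# Spaces are excluded here because they are reported separately.
UNUSUAL = re.compile(r"[^A-Za-z0-9_. ]")


def build_file_name(project: str, description: str, created: date,
                    version: int, ext: str) -> str:
    """Assemble the elements in a fixed order with a YYYYMMDD date."""
    stamp = created.strftime("%Y%m%d")
    return f"{project}_{description}_{stamp}_v{version:02d}.{ext}"


def check_file_name(name: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if " " in name:
        problems.append("contains spaces")
    if UNUSUAL.search(name):
        problems.append("contains unusual characters")
    return problems
```

With a hypothetical project acronym "SPU", build_file_name("SPU", "intv", date(2024, 3, 5), 2, "txt") returns "SPU_intv_20240305_v02.txt", which is exactly 25 characters long and passes the check. A name such as "my file!.txt" is reported as containing both spaces and unusual characters.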

Consider using batch renaming tools to rename files in bulk where necessary.

If, partway through a project, there is a need to rename a large number of files to conform to a systematic file naming convention you have adopted, there are a number of tools available to make this process easier.
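
Where no dedicated tool is at hand, a short script can do the same job. The following Python sketch is one possible approach, not a recommended tool: it replaces spaces with underscores, drops characters outside letters, digits, and underscores, and defaults to a dry run that only reports what would change. The function names and the exact cleaning rules are assumptions for this example.

```python
import re
from pathlib import Path


def normalized_name(name: str) -> str:
    """Replace whitespace runs with one underscore and drop unusual characters."""
    path = Path(name)
    stem = re.sub(r"\s+", "_", path.stem.strip())
    stem = re.sub(r"[^A-Za-z0-9_]", "", stem)
    return stem + path.suffix  # the extension is kept unchanged


def bulk_rename(folder: Path, dry_run: bool = True) -> list:
    """Rename files in folder to normalized names; return (old, new) name pairs."""
    changes = []
    planned = set()
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        new_name = normalized_name(path.name)
        if new_name == path.name:
            continue  # already conforms
        target = path.with_name(new_name)
        if target.exists() or new_name in planned:
            # Never overwrite another file; leave this one for manual review.
            continue
        planned.add(new_name)
        changes.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return changes
```

Running bulk_rename(folder) first and reviewing the returned pairs before calling it again with dry_run=False reduces the risk of an unwanted rename. It is also prudent to back up the folder beforehand.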

Examples of file renaming tools:

Versioning

Versioning, or version control, refers to the management of file revisions. Versioning assists researchers in managing data during a project where experimentation, revisions, and re-examinations are undertaken. Text files as well as data files may undergo numerous changes before the final version is set.

Versioning mechanisms, such as directory structure and file naming conventions, assist users in differentiating between different versions of a dataset and accompanying files.
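
As a sketch of versioning through file names, the Python function below derives the next version of a file name. It assumes a hypothetical convention in which the version is written as a "_v" suffix followed by digits at the end of the name, just before the extension; other conventions would need a different pattern.

```python
import re
from pathlib import Path

# Assumed convention: the name ends in "_v" plus digits, e.g. "survey_v01.csv".
VERSION_SUFFIX = re.compile(r"_v(\d+)$")


def next_version_name(name: str) -> str:
    """Return the name with its version number increased by one.

    A name without a version suffix is treated as unversioned and gets "_v01".
    """
    path = Path(name)
    match = VERSION_SUFFIX.search(path.stem)
    if match:
        width = len(match.group(1))          # keep the existing zero padding
        number = int(match.group(1)) + 1
        stem = path.stem[:match.start()]
    else:
        width, number, stem = 2, 1, path.stem
    return f"{stem}_v{number:0{width}d}{path.suffix}"
```

For example, next_version_name("survey_v01.csv") returns "survey_v02.csv", and next_version_name("survey.csv") returns "survey_v01.csv". Zero-padded numbers keep versions in the correct order when files are sorted by name.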

Researchers should also consider discarding obsolete versions of files, but care should be taken in making decisions about future use of files before discarding. In some instances, keeping backup copies of versions may be advisable.

A number of tools are available for file versioning, including:

Apache Subversion
Git

Backing Up Files

Backing up files refers to the creation of file copies. These copies should reside in a separate physical location from the working or stored files. Arranging a regular backup schedule mitigates the possibility of data loss, and backup copies can be used to restore damaged or lost original files.
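
A backup can be sketched in Python as a copy followed by a check that each copy matches its original. This is a minimal illustration under stated assumptions, not a substitute for institutional backup services: the function names are hypothetical, and the destination is assumed to be a folder on a separate drive or server that is not inside the source folder.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_folder(source: Path, destination: Path) -> list:
    """Copy every file under source into destination and verify each copy.

    Returns the relative paths of the files copied. The destination should be
    outside the source folder, ideally on a different physical device.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source)
        target = destination / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves timestamps where possible
        if sha256_of(path) != sha256_of(target):
            raise IOError(f"Backup of {relative} does not match the original")
        copied.append(relative.as_posix())
    return copied
```

Comparing checksums catches copies that were damaged in transit. Running a script like this on a regular schedule is left to the operating system's task scheduler.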

Acknowledgements

We would like to thank the UK Data Service for use of their training materials in the creation of these modules.

We would also like to thank EDINA and the Data Library at the University of Edinburgh for use of materials from the Research Data MANTRA [online course] in the creation of these modules.