Profiles
Create a new profile
A profile is the files and folders you want to find out about, and the results of
profiling them. You can create profiles by clicking the New
button, or selecting the
File / New
menu item. DROID automatically creates a blank profile for you when it starts.

You can create as many profiles as you like. Each profile appears in its own tab,
underneath the toolbar. You can choose files and folders as soon as the tab appears using the Add command, but it takes a few seconds for a new profile to be ready to run. By default, profiles are "Untitled" until they are saved with a filename.
Once a profile is created, its settings are fixed from the Profile Defaults specified in the preferences window.
Choose files and folders
To add files and folders to a profile, click the Add
button, or
select the Edit / Add files and folders
menu.
You can also right-click with your mouse on the main profile window, and
select the Add option from the popup menu.
A selection window will appear. On the left-hand side is a navigator to explore your folders. The example below shows the "Internet Explorer" folder selected on the left-hand side, and the contents of this folder are shown on the right.
Any files or folders selected on the right will appear in the "Selected resources" text box at the bottom. Select multiple files and folders at the same time by holding down the SHIFT key to select a list, or the CTRL key to individually select or deselect files or folders.
Include subfolders
When a folder is added to a profile, it is often useful to also profile all of its subfolders. Underneath the selected resources text box is the "Include subfolders" checkbox, which is checked by default. Uncheck this box if you only want to profile the files directly inside a folder and to ignore any subfolders. If you are only selecting files, this setting makes no difference. Note that if you are profiling inside archival files, their contents will still be profiled regardless of this setting, which only applies to folders in the file system you are profiling.
Adding your selection
Press the OK
button to add the selected files and folders to your profile, or the Cancel
button to leave the selection window with no changes made. Add
can be used repeatedly to add files and
folders from different locations in your file system to your profile.
Removing your selection
To remove files or folders from the profile, select them in the main window, and either
press the Remove
button, or use the Edit / Remove
files or folders menu item. You can also
right-click with your mouse on the main profile window, and select the Remove option
from the popup menu.

Only the top-level files and folders you have added to a profile can be removed. When you run a profile with a folder, the profiling process will automatically add the files and folders it finds underneath it. After you run a profile, you cannot add or remove files or folders from the profile any more; its specification becomes fixed.
Running a profile
You can start profiling files or folders using the Start
button, or by selecting the Run / Start Identification
menu item.

As files and folders are identified, they are added to the profile, and you can see all the results obtained so far. If you expand a folder, it will show all the files and folders found so far. If you happen to open the folder that is being profiled at that time, you will see its child files and subfolders appearing under it.
Once you have started a profile running, you cannot choose any further files or folders in it. The specification of which files and folders to process in your profile becomes fixed at the point it first begins running. If you want to subsequently profile other files or folders, then you can do this in a new profile. Results held in multiple profiles can be exported and reported on together.
Progress
When your profile first starts running, it counts all the files and folders in your profile, including those in subfolders. Once it has counted them, a progress bar will show how much work has been completed, and how much more there is to do.
The progress bar only gives an estimate of progress. Files which exist inside other files (e.g. zip files) are not accounted for, and in any case, files can be added or removed from your file system while your profile is still running. In most cases, the progress bar gives a fairly good indication of the amount of work remaining to be done. The current file being analysed is also displayed, so even if the progress bar doesn't seem to be moving, you can see that it is still profiling.
Throttling back
You can control how quickly or slowly DROID processes the files in your profile, and this can be done at any time, whether your profile is running or not. This can take the load off your computer, network or disks if running it would impact you, or other users. By default, DROID works as quickly as it can, but you can tell DROID to delay for a short amount of time between each file it processes.
To slow down or speed up DROID, use the slider control at the bottom right of the main window. When the slider is at the far left of the control, DROID will work as fast as it can (a delay of zero). When the slider is at the far right, DROID will wait for one second between processing each file. Even very small delays can reduce the load on networks and file servers, so the normal useful range for throttling is usually between zero and a hundred milliseconds.
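To get a feel for what a given delay costs, the throttle can be thought of as a pause inserted between files. The sketch below is illustrative Python, not DROID's own code: `process_files`, `total_delay_seconds`, `identify` and `delay_ms` are hypothetical names chosen for this example.

```python
import time


def process_files(paths, identify, delay_ms=0):
    """Process each path in turn, pausing delay_ms milliseconds between files."""
    results = []
    for index, path in enumerate(paths):
        if index > 0 and delay_ms > 0:
            time.sleep(delay_ms / 1000.0)  # throttle: wait before the next file
        results.append(identify(path))
    return results


def total_delay_seconds(file_count, delay_ms):
    """Total time spent waiting: one pause between each pair of files."""
    pauses = max(file_count - 1, 0)
    return pauses * delay_ms / 1000.0
```

For example, a delay of 100 milliseconds across 100,001 files adds 10,000 seconds of waiting in total (a little under three hours), which is why small delays are usually enough.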
Restrictions while profiling
While your profile is running, its filter cannot be enabled, and you cannot save it. If you had enabled a filter previously, it will be automatically disabled when your profile is started. Disabling a filter doesn't get rid of it - it just turns it off temporarily. You can turn it on again when the profile is finished, or paused.
In addition, you cannot run any reports or export a profile while it is running. You can report and export other profiles which are not running.
If you want to do any of the above tasks while a profile is running, you can temporarily pause it, carry out the task, and then resume profiling later.
Pausing
You can pause your profile at any time by pressing the Pause
button, or by selecting the Run / Pause
menu item.

The progress bar will freeze at whatever point it reached, and you will see no further messages about files being analysed in the status bar. Your profile may not pause straight away, as there may be a few outstanding items to be processed in its work queue, particularly if it is in the middle of uncompressing a large archival file. Once paused, there are no restrictions on what you can do with your profile (except that you cannot choose further files or folders once you have started profiling).
Resuming
You can resume profiling by simply running it again, using
the Start
command. Profiling will pick up running
from the point it left off, leaving all results so far intact. You can also save a
paused profile, and then open and resume it at a later date. If the files
where your profile was last paused are now different, DROID will attempt to resume by
locating the nearest place it can successfully re-start from.
Information collected by DROID
Type
DROID categorises the files and folders it profiles as being one of three types:
- File
- Folder
- Archival file
Files have format identifications, but do not have other files or folders inside them. Folders do not have any format identifications or sizes, but can contain other folders, files and archival files inside them. Archival files are like folders, in that they can contain other folders, files and archival files inside them, but they are also files, so they have format identifications and a file size. In this version, DROID can look inside zip, tar and gzip archival files. Archival files may have other archival files nested inside them. DROID will also profile inside these, and in any further nested archival files.
File name
The name of a file, folder or archival file is its name, independent of its location on a disk or inside an archival file. The file name extension (if any) is part of its name. DROID treats all filenames as case-sensitive. For example, 'MYDOCUMENT.DOC' and 'mydocument.doc' are regarded as different file names.
File name extension
File extensions are a convention to indicate the broad type of a file (or archival file) by appending a short string to a file name, separated by a full stop. On Microsoft Windows, the filename extension is used to indicate to the operating system what application to run when double-clicking on the file. Other operating systems do not use the filename extension to determine which application to use. However, filename extensions have become a de-facto standard for indicating the broad type of a file format, and are usually appended to filenames, even when a file is created on other platforms.
DROID extracts the file extension (if any) from a file name or archival file name and stores it separately, to facilitate reporting, sorting and filtering on the extension alone.
File names which begin with a full stop and have no other full stops in them are not regarded as having an extension. For example, a file called '.myfile' has a file name of '.myfile' and a blank extension, whereas '.myfile.doc' has a file name of '.myfile.doc' and an extension of '.doc'. This is because file names starting with a full stop are hidden files in Unix file systems, and also because it is not likely that a file name would be entirely composed of a file extension, with no name before it.
DROID treats file extensions as case sensitive. However, it converts all file extensions to lower-case to facilitate filtering and reporting.
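The extension rules described above can be sketched in a few lines of Python. This is an illustration of the rules as stated here, not DROID's actual implementation; the function name `file_extension` is hypothetical, and returning the extension without its leading full stop is an assumption made for this example.

```python
def file_extension(name):
    """Return the lower-cased extension of a file name, or '' if it has none.

    A name whose only full stop is its first character (e.g. '.myfile')
    is treated as having no extension.
    """
    dot = name.rfind(".")
    if dot <= 0:  # no full stop at all, or only a leading full stop
        return ""
    return name[dot + 1:].lower()
```

With this sketch, `file_extension('.myfile')` gives an empty string, while `file_extension('.myfile.doc')` and `file_extension('MYDOCUMENT.DOC')` both give `'doc'`.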
Extension mismatch warning
Sometimes file extensions are incorrect for the type of the file, or are missing where there should be one. If DROID detects that the file extension for a file name does not match the formats it has identified, it will issue a file extension mismatch warning. For example, if a file called 'myfile.doc' is identified as a spreadsheet, then a file extension mismatch warning will be issued.
In the graphical user interface, extension mismatch warnings appear as a warning symbol against the file extension itself. When exported to a CSV file, it will appear as a True or False value in its own column.
Location
DROID records the location of every file and folder it profiles. It records location in two ways, using a file Uniform Resource Identifier (URI), and a file path where one exists. Like file names and extensions, DROID treats file paths and URIs as case sensitive.
There are two ways of recording location because not all files and folders have a file path, although this is the usual method of identifying location in a file system. Any file, folder or archival file which is inside another archival file does not have a defined file path, as it is inside the archival file, not directly in the file system. For example, if we have:
- a folder called 'Folder' on the 'C:\' drive of a Windows computer
- a file called 'Document.doc' inside 'Folder'
- an archival file called 'Archive.zip' inside 'Folder'
- a spreadsheet called 'Spreadsheet.xls' inside 'Archive.zip'
- a folder called 'Another folder' inside 'Archive.zip'
- a picture called 'Large picture.jpg' inside 'Another folder'
Then we have the following file paths and URIs:
File path | Uniform Resource Identifier (URI) |
---|---|
C:\Folder | file:/C:/Folder/ |
C:\Folder\Document.doc | file:/C:/Folder/Document.doc |
C:\Folder\Archive.zip | file:/C:/Folder/Archive.zip |
(none) | zip:/file:/C:/Folder/Archive.zip!Spreadsheet.xls |
(none) | zip:/file:/C:/Folder/Archive.zip!Another%20folder/ |
(none) | zip:/file:/C:/Folder/Archive.zip!Another%20folder/Large%20picture.jpg |
Only files, archival files or folders which are directly accessible in the file system have a file path. Those files and folders which are inside the zip file do not have a file path, but do have a URI, which tells you that they are inside the zip file, where they can be found in it, and where the zip file they are inside is to be found.
The prefixes of a URI tell you what sort of resource is being described by the URI, and the exclamation marks indicate where one type of resource is contained by another. For example, for 'Spreadsheet.xls', we can see that there is a file, C:/Folder/Archive.zip, with the prefix file:/. The exclamation mark (!) tells us that the spreadsheet is contained by the Archive.zip file, and the first prefix zip:/ tells us the type of the containment is a zip file. Note that spaces in URIs are encoded by '%20', and folder separators are always forward slashes. If zip files are contained inside zip files, inside zip files, more prefixes and exclamation marks are added as needed.
URIs mean that all resources profiled by DROID have a unique reference which tells you where the resource is, even if it is inside an archival file, inside another archival file, and so on. This is something that file paths cannot do. However, both are provided, as working with file paths is easier, where they exist for a resource.
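If you work with exported URIs in your own scripts, the structure described above can be taken apart mechanically. The following Python sketch assumes URIs shaped exactly like the examples in this section (prefixes such as zip:/ followed by file:/, with exclamation marks marking containment); `split_uri` is a hypothetical helper, and a file name that itself contains an exclamation mark would not be handled correctly.

```python
from urllib.parse import unquote


def split_uri(uri):
    """Split a URI into (container_prefixes, path_parts).

    container_prefixes lists the leading prefixes before 'file:/' (e.g. ['zip']).
    path_parts[0] is the location in the file system; each later part is a
    location inside the archival file named by the part before it.
    """
    prefixes = []
    rest = uri
    # Peel off leading prefixes such as 'zip:/' until 'file:/' is reached.
    while not rest.startswith("file:/"):
        prefix, separator, rest = rest.partition(":/")
        if not separator:
            raise ValueError("no file:/ prefix found in URI: " + uri)
        prefixes.append(prefix)
    rest = rest[len("file:/"):]
    # '!' marks containment; '%20' and similar escapes are decoded.
    parts = [unquote(part) for part in rest.split("!")]
    return prefixes, parts
```

Applied to the last URI in the table, this returns the prefix list `['zip']` and the parts `'C:/Folder/Archive.zip'` and `'Another folder/Large picture.jpg'`.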
File size
The size of a file or archival file is recorded as the number of bytes used by the file. Files can have a size of zero (no content, just a record in the file system). Folders do not have a size.
The size of an archival file is the size of the archival file itself, not the sum of the sizes of its contents. For example, zip files compress their contents, so the sum of the sizes of the files inside a zip file will usually be bigger than the size of the archival file itself.
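The difference between the two sizes is easy to demonstrate. The Python sketch below builds a zip file in memory and compares its size with the total size of what was put into it; `archive_and_content_sizes` is a hypothetical helper written for this illustration and has nothing to do with how DROID measures sizes.

```python
import io
import zipfile


def archive_and_content_sizes(files):
    """Zip the given {name: bytes} mapping in memory.

    Returns (size of the zip itself, sum of the uncompressed content sizes).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    archive_size = len(buffer.getvalue())
    content_size = sum(len(data) for data in files.values())
    return archive_size, content_size
```

Highly repetitive content compresses well, so for a single file of 100,000 identical bytes the archive is far smaller than its content. Content that is already compressed (such as JPEG images) may not shrink at all.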
Last modified date
Most files, folders and archival files record the date and time on which they were last modified.
It is possible that not every file, folder or archival file will have a last modified date. For example, in some cases, resources inside archival files may not record this date.
It is important to note that last-modified dates can be changed when files are copied from one server to another, so this date may not reflect the last date a user actively modified the content of a file. Also, the content of a file (the data within it) may actually be older than the file itself if a file was copied, or simply typed up manually from an older piece of content.
Some files may have noticeably inaccurate dates, e.g. 1 Jan 1970. In this case, the files will be newer than indicated. This error will likely be caused by the battery failing on the internal clock of the computer from which the document was uploaded, or some other error which caused the date to be set incorrectly.
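The date 1 Jan 1970 turns up often because many systems store timestamps as a count of seconds since that date, so a timestamp that was never set (a value of zero) is displayed as 1 Jan 1970. A minimal Python sketch for spotting such dates in your own data follows; `is_suspect_date` is a hypothetical helper and the cut-off it uses is an assumption you may wish to adjust.

```python
from datetime import datetime, timezone

# A timestamp of zero seconds corresponds to this moment.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def is_suspect_date(modified):
    """Return True if a timezone-aware date falls on or before 1 Jan 1970 (UTC)."""
    return modified.astimezone(timezone.utc).date() <= EPOCH.date()
```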
Number of format identifications
DROID attempts to identify the format of files, including archival files, but not folders. The number of identifications DROID records for a file can vary. A file can have:
- zero, if DROID can't identify a format at all.
- one, if it is unambiguously identified as a single format.
- more than one, shown as a number in brackets such as (2), if DROID can't unambiguously decide what format it is in.
In the user interface, the number in brackets indicates the number of possible format identifications made. Clicking on the link will bring up a window showing all the identifications in a table. Multiple possible identifications can happen for three reasons.
- A format is identified purely on the basis of its file extension, so multiple versions of a file format may match the same extension.
- A format has several versions which are very similar and hard for DROID to distinguish between, so DROID will simply report all the possible versions.
- A file may contain patterns, purely by chance, which appear in more than one file format.
File formats
When DROID identifies a file format, it records four pieces of information:
- format name
- format version
- PRONOM Unique Identifier (PUID)
- mime-type
The format name is simply a human-readable name given to a file format or family of file formats, for example, 'Microsoft Word'. The format version is the version of the format, for example '97-2003'. The PUID is a globally unique, persistent identifier for a file format and version, assigned by The National Archives through its PRONOM file format registry. For example, the PUID for the 'Microsoft Word 97-2003' file format is 'fmt/40'.
PUIDs are guaranteed never to change, although new PUIDs may be defined. Clicking on a PUID in DROID will take you to the relevant page for that file format on The National Archives PRONOM website. The website will also help you with some file format names that you may be unfamiliar with. In particular, you may see files identified as 'OLE2 Compound Document Format' (PUID fmt/111) which you can interpret as 'Microsoft Office generic'. In these cases, the file is a Microsoft Office file which DROID could not identify any more closely, but the file extension may indicate the format more precisely.
Finally, the mime-type is another scheme for identifying broad types of files in use on the internet. They are assigned by a body called the Internet Assigned Numbers Authority. Mime-types are quite broad classifications, so many different file formats will have the same mime-type. For example, the mime-type for 'fmt/40' is 'application/msword' which is shared by all other binary Microsoft Word formats.
Identification methods
DROID has three different methods of identifying file formats:
- extension
- signature
- container
An 'extension' identification means that a format was identified purely on the basis of its file extension. Such an identification may not be reliable, as files can be named in any way, and extensions do not identify formats down to the version level, so such identifications can be quite broad, and may result in multiple identifications.
A 'signature' identification means that a format was identified by finding signature patterns inside the file which are known to occur in particular file formats and versions. This method is quite reliable, as it is fairly unlikely that by chance a file will happen to have a pattern belonging to a different file format than its own.
A 'container' identification means that a format was identified by finding embedded files (possibly with signatures of their own) inside the main file. For example, Microsoft Office 2007 word processing files are actually zip files containing xml files, images or other resources used in the document. A container identification would identify the main file as a Microsoft Office 2007 file, not a zip file. This method is very reliable, as not only does the broad type of container have to be identified (e.g. zip), but the zip file must then be opened, and files inside scanned for further identifications to be made. The original zip identification is removed, and replaced by the Office 2007 identification, on the basis of the files discovered within it.
Note that this is not the same as profiling files inside archival files, even though container-format files may be based on an archival format like zip. A container-format is a single file format, whose specification relies on specific files being inside it to define the overarching format. An archival file format is a format whose only purpose is to contain other files, and the particular files inside it have no effect on its identification as an archival format.
Content hash
DROID can optionally generate a content hash of the contents of each file and archival file, using the industry standard 'MD5', 'SHA1' or 'SHA256' algorithms. A content hash is a short signature that can be used to identify the content of the file. It is extremely unlikely that two different files will have the same content hash (although this is a remote possibility).
Content hashes can be used to detect files with duplicate content, or can be linked to forensic hash databases to find or exclude files which are widely used (and therefore not unique to your organisation) or which contain illegal content. See detecting duplicate files for more information.
Content hashing is turned off by default, as producing a hash requires reading the entire file, which will slow down DROID significantly.
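To illustrate what a content hash is and why it requires reading the whole file, here is a minimal Python sketch using the standard `hashlib` module. It is not DROID's code; `content_hash` is a hypothetical helper, and it reads from any binary stream so that it can be tried without touching the disk.

```python
import hashlib
import io


def content_hash(stream, algorithm="md5", chunk_size=65536):
    """Hash everything in a binary stream, reading it one chunk at a time."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)  # every byte of the content passes through here
    return digest.hexdigest()


# Two streams with identical content produce identical hashes.
first = content_hash(io.BytesIO(b"same content"), "sha256")
second = content_hash(io.BytesIO(b"same content"), "sha256")
```

For a file on disk you would pass an object opened in binary mode, for example `open(path, "rb")`. The algorithm names accepted by `hashlib.new` include `"md5"`, `"sha1"` and `"sha256"`.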
Status
As DROID profiles your files and folders, it records whether the profiling was successful or not. There are four different statuses which a file or folder can have:
Status | Description |
---|---|
Done | The file or folder was read successfully and any results found recorded. |
Not found | The file or folder was moved or deleted before it could be profiled. |
Access denied | The operating system refused read access to DROID. You will have to grant read permission to those files or folders if you want DROID to profile them. |
Error | An error occurred while trying to read the file. You may be able to determine the cause of the error by examining DROID's log files. |
Saving and loading a profile
DROID can open and save profiles to a file with a .droid filename extension. Note that DROID 6 cannot open profiles created by earlier versions of DROID.
Opening a profile
To open a saved profile, press the Open
button,
or select the File / Open
menu item. A standard file
open selection dialog will appear. Navigate to the droid profile you want to open,
and press the OK
button. The profile will open in a
new tab. If it is a large profile (containing hundreds of thousands of files and
folders), it may take a few minutes to open. A progress bar at the bottom of the screen
shows how much of the profile has been opened so far.
Saving a profile
To save a profile, press the Save
button, or
select the File / Save
menu item. If your profile has
never been saved before, a standard file save dialog will appear, and you can choose where to
save the file and what name it has. If your profile has
been saved before, it will be saved to the place it was opened from, and no file save dialog
will appear. If it is a large profile (containing hundreds of thousands of files and
folders), it may take a few minutes to save. A progress bar at the bottom of the
screen shows how much of the profile has been saved so far.
To save an existing profile to another file, select the File / Save As...
menu item. This will always bring up a file save dialog, allowing you
to choose a different file to save to.
Profile files
Profile files are actually zip files, which contain some XML files describing the
profile, and a database containing any results of profiling so far. DROID currently
uses the Apache Derby database, version 10.7, which can be opened using
various third-party tools, such as DB Visualizer. The username to connect to a droid database is droid_user, and the password is the same as the username.
It is possible to manually edit the profile settings contained within the profile.xml
file. However, it is
not recommended that you do this, as changing settings within a profile may mean that
inconsistent results are returned (if the profile is paused and there are remaining results
to process), or may even cause DROID to crash if the settings conflict with the profile
state. In particular, you must not change which signatures are used by a profile.
We cannot guarantee that other settings are safe to change. Changing this file is
entirely at your own risk.
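Because a profile file is a zip file, you can inspect what it contains without changing it. The Python sketch below lists the entries in a profile using the standard `zipfile` module. `list_profile_entries` and `make_stand_in_profile` are hypothetical helpers; the stand-in is an in-memory zip created only so the example can be run, and it is not a real DROID profile.

```python
import io
import zipfile


def list_profile_entries(profile):
    """Return the entry names inside a profile file (opened read-only).

    `profile` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(profile) as archive:
        return archive.namelist()


def make_stand_in_profile():
    """Build a placeholder zip in memory for demonstration; NOT a real profile."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("profile.xml", "<placeholder/>")
    buffer.seek(0)
    return buffer
```

With a real profile you would pass its path instead, for example `list_profile_entries("my_profile.droid")`, where the file name is a placeholder.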
Explore the results
Results appear in your profile tab as soon as they are profiled. If you have profiled any folders, then you can see the files and subfolders inside them by clicking on the open folder icon to their left. To the right of each resource are columns containing information about each resource.
Sorting
You can sort the columns by clicking on the column header. Clicking once will sort in descending order, another click will sort in ascending order, and a third click will remove the sort. The sort will also group resources that can have resources inside them (e.g. folders or zip files) together, followed by the files, to keep similar resources together.
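The grouping behaviour can be pictured as a two-step sort: first by the chosen column, then with folders and archival files moved ahead of plain files. The Python sketch below is only an illustration of that idea; the dictionary keys `name` and `container` and the function `sort_resources` are assumptions made for this example, not DROID's internals.

```python
def sort_resources(resources, column="name", descending=True):
    """Sort by a column, keeping resources that can contain others first."""
    ordered = sorted(resources, key=lambda r: r[column], reverse=descending)
    # Python's sort is stable, so the column order is preserved within
    # each group when containers are moved ahead of plain files.
    ordered.sort(key=lambda r: not r["container"])
    return ordered


resources = [
    {"name": "b.txt", "container": False},
    {"name": "A", "container": True},
    {"name": "a.txt", "container": False},
    {"name": "Z", "container": True},
]
```

Sorting this list in descending order by name gives Z, A, b.txt, a.txt: the two containers first, then the two files, each group in descending order.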
Filtering
When you have a lot of results, it is useful to be able to filter them, in order to narrow
down on the files or folders of particular interest. You can define a filter for your
results by clicking the Filter
button, or by
selecting the Filter / Edit Filter...
menu item. In the
filter definition dialog, you can add one or more conditions that a file must meet in order
to be visible in your profile, change whether your files must meet all the conditions you
specify, or any of them, and set whether the filter is enabled or not.

Any or All
- All: files must meet every filter condition specified in order to be visible in your profile.
- Any: files need to meet only one of the filter conditions specified in order to be visible in your profile.
Filter enabled
This checkbox allows you to set whether your filter is enabled or not. When enabled,
results in your profile will be filtered. When disabled, your filter conditions will simply
not apply (but their definition will remain). You can also enable or disable filters
via the Filter / Filter on
menu item.
Adding filter conditions
To specify a filter condition, you must fill out three items in a row in the filter table.
- Field - the type of information you want to filter on,
- Operation - the type of comparison to make,
- Values - what to compare the value of the field to.
Fields
Fields are the kind of information you want to filter on. Click on the first column of
the filter condition table, on the drop-down box that says "Please select...".
You can select from the following fields:
- File name - the (case-sensitive) name of a file or folder
- File size - the size of a file in bytes
- File extension - the (case-sensitive) file name extension
- Last modified date - the date a file or folder was last modified
- Resource type - whether the resource is a folder, file or archival file (a file that contains other files, e.g. a zip file).
- Mime type - the mime-type of a file
- PUID - the PRONOM Unique Identifier of a file.
- Format name - the name of a format identified by a PUID.
- Identification method - the method by which a file or archival file was identified.
- Job status - whether a file has been processed, or had an error when profiled.
- Extension mismatch - whether the format identified is consistent with the file extension.
For more information on these fields and the information contained in them, please see Information collected by DROID. You can specify more than one filter condition on the same field. Notice that if you are filtering using the All method, this makes it possible to create filters which no file can meet (e.g. size < 100 and size > 100)!
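The difference between the All and Any methods, and the way an impossible filter arises, can be shown with a short Python sketch. This is an illustration only; `matches`, the `size` key and the example conditions are hypothetical and do not reflect how DROID stores filters.

```python
def matches(resource, conditions, method="all"):
    """Return True if the resource passes the filter conditions.

    method="all": every condition must hold.
    method="any": at least one condition must hold.
    """
    results = [condition(resource) for condition in conditions]
    return all(results) if method == "all" else any(results)


# Two conditions on the same field: size < 100 and size > 100.
conditions = [
    lambda resource: resource["size"] < 100,
    lambda resource: resource["size"] > 100,
]
```

With the All method no size can satisfy both conditions, so every file is hidden. With the Any method the same two conditions show every file except those of exactly 100 bytes.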
Operations
Once you have selected the field you want to filter on, you can specify what kind of comparison operation should be performed on it. The available operations vary by the type of field you have selected:
- Numbers and Dates: less than (<), less or equal (<=), equals (=), greater than (>), greater or equal (>=), not equals (<>)
- Text (case-sensitive): equals (=), not equals (<>), starts with, ends with, contains, does not start with, does not end with, does not contain.
- Sets: some types of field can have one or more values selected out of a defined set of values (PUID, Mime type, Identification method, Job status). To compare sets, you can specify that the field has "any of" the values in the specified set, or "none of" the values in the specified set.
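If you want to reproduce these comparisons in your own scripts (for example, when post-processing exported results), the operations listed above map naturally onto simple functions. The Python sketch below is one possible mapping and is not taken from DROID; the dictionary and function names are hypothetical.

```python
import operator

# Comparisons available for numbers and dates.
NUMBER_AND_DATE_OPERATIONS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<>": operator.ne,
}

# Comparisons available for text; all of them are case-sensitive.
TEXT_OPERATIONS = {
    "=": operator.eq,
    "<>": operator.ne,
    "starts with": str.startswith,
    "ends with": str.endswith,
    "contains": lambda text, value: value in text,
    "does not start with": lambda text, value: not text.startswith(value),
    "does not end with": lambda text, value: not text.endswith(value),
    "does not contain": lambda text, value: value not in text,
}


def any_of(field_values, selected):
    """True if the field has at least one of the selected values."""
    return not set(field_values).isdisjoint(selected)


def none_of(field_values, selected):
    """True if the field has none of the selected values."""
    return set(field_values).isdisjoint(selected)
```

For instance, `TEXT_OPERATIONS["ends with"]("report.doc", ".doc")` is true, while `TEXT_OPERATIONS["="]("A.doc", "a.doc")` is false because text comparisons are case-sensitive.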
Values
Like the operations, the values you provide depend on the type of fields you are filtering on:
- Numbers: enter a positive whole number. Only digits can be entered; decimal points and plus or minus signs are not allowed.
- Dates: provide the day, month and year in the boxes provided.
- Text: enter your text into the box provided.
- Sets: You cannot enter set values directly into the value area; you must select them using the set selection dialog. To bring up the dialog, click the ... button to the right of the value area. Once you have selected your values, a read-only representation of them will appear in the value area. You can always edit your sets by bringing up the set selection dialog again.
Set selection dialog
The set selection dialog lets you select one or more values from a list.

On the left-hand side of the dialog are all the available values to select. On the
right-hand side are the values you want to filter on. To add values to your selection,
select the values you would like to add on the left-hand side, and press the Add
button
between the two panes. You can select values individually, or you can hold down the
SHIFT key to select a list of values. If you hold down the CTRL key, you can select or
deselect multiple items individually. To remove values from your selection, highlight
the values you want to remove on the right-hand side, using the same techniques as adding,
and press the Remove
button.
Once you are happy with your selection, press the OK
button at the
bottom of the dialog. Your selected values will be placed in the value box in the
filter dialog. If you want to cancel any changes you have made, press the Cancel
button.
Removing Filter Conditions
To remove a filter condition, click the Remove
button at the far right of the row.
You cannot remove the final row in the filter table, which always shows Please select...
in the field column, as this row enables you to add new filter conditions. Unless you actually specify a field,
this row does not contribute to the filter specification.
Load and Save filters
If you have a filter you want to re-use, you can save filters to a file, and load them back from a previously saved file. Clicking on these buttons brings up a file selection dialog window.
Apply and cancel
If you are happy with your filter definition, press the Apply
button at the
bottom of the screen. Your filter specification will be associated with your profile,
and if you have enabled the filter, it will be applied to the profile immediately. If
you press the Cancel
button, all changes you have made to the filter will be discarded, and
any previous filter will be restored as it was before you opened the filter dialog.
Reports
Selecting profiles to report on
DROID can create a variety of reports containing statistics about the files and folders in your profiles, and save the report as several different kinds of file.
To create a summary report of one or more profiles, press the Report
button,
or select the Report / Generate Report
menu option.
This will bring up the profile selection dialog, which allows you to select which
profiles to report on.

If a selected profile has an active filter, this filter will be used when generating the report. So you can produce different reports on the same profiles, by using different filters. For example, you could filter out all files which are very large, giving you averages which are closer to the mean values normally encountered. Or you could filter out everything except document formats, letting you produce statistical reports on document types only.
Available reports
Once you have selected some profiles to report on, select which report you wish to generate. DROID ships with eleven pre-defined reports:
Report Name | Description |
---|---|
File count and sizes | The count, total size, and minimum, maximum and average size of all files in your profiles |
Total count of files and folders | A count of all files and folders in your profiles |
Total unreadable files | A count of all the unreadable files in your profiles |
Total unreadable folders | A count of all the unreadable folders in your profiles |
File count and sizes by file extension | The count, total size, and minimum, maximum and average size of all files in your profiles broken down by their file extensions. |
File count and sizes by file format PUID | The count, total size, and minimum, maximum and average size of all files in your profiles broken down by their file format PUIDs. |
File count and sizes by mime type | The count, total size, and minimum, maximum and average size of all files in your profiles broken down by their mime types. |
File count and sizes by month last modified | The count, total size, and minimum, maximum and average size of all files in your profiles broken down by the month they were last modified in. Months are represented as a number from 1 (January) to 12 (December). |
File count and sizes by year last modified | The count, total size, and minimum, maximum and average size of all files in your profiles broken down by the year they were last modified. |
File count and sizes by year and month last modified | The count, total size, and minimum, maximum and average size of all files in your profiles broken down by the year and month they were last modified. Months are represented as a number from 1 (January) to 12 (December). |
Comprehensive breakdown | A report combining all of the above reports into a single report. This report may take a long time to generate over large profiles. |
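The statistics in the "File count and sizes by ..." reports are straightforward to picture. The Python sketch below computes the same five figures (count, total, minimum, maximum and average size) grouped by file extension. It is an illustration rather than DROID's report engine; `sizes_by_extension` and the shape of its input are assumptions made for this example.

```python
from collections import defaultdict


def sizes_by_extension(files):
    """Summarise (extension, size_in_bytes) pairs, grouped by extension."""
    groups = defaultdict(list)
    for extension, size in files:
        groups[extension].append(size)
    return {
        extension: {
            "count": len(sizes),
            "total": sum(sizes),
            "min": min(sizes),
            "max": max(sizes),
            "average": sum(sizes) / len(sizes),
        }
        for extension, sizes in groups.items()
    }


summary = sizes_by_extension([("doc", 100), ("doc", 300), ("pdf", 50)])
```

Here the 'doc' group has a count of 2, a total of 400 bytes, a minimum of 100, a maximum of 300 and an average of 200 bytes.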
Building your report
Next, press the Report on profiles...
button to generate your report.

Statistics presented are broken down by profile, and aggregated across all the profiles you have selected to report on.
Exporting your report
Finally, if you want to save your report, press the Export...
button.
This will enable you to save your report in a variety of file formats. By
default, all reports can be saved as:
- Web page
- Text
- DROID Report XML
In addition, some reports have special output formats defined. As shipped, DROID 6 includes a PLANETS XML export option for the Comprehensive Breakdown report. All reports are generated from the DROID Report XML, so you can use this format to transform into any other formats you require, using XSLT technology. All report definitions, and any associated transforms are located in the report_definitions subfolder underneath your user settings folder.