Glossary

A

Analytics – A general term to describe using technology to get a better understanding of a document collection.
Archive – The process of preserving or storing data for an indefinite period of time.
ASCII – A 7-bit text encoding limited to 128 characters, covering English letters, digits, punctuation and control codes.
Attachment – Usually refers to emails that have one or more additional documents embedded or attached to the email. However, this term is also used to describe any electronic file that has an embedded file; the embedded file is the attachment.

B

Bates Number – A document identification technique where each page in a document is assigned a unique sequential number.
Blowback – Printing of electronic files to paper format.
Boolean Search – A search technique that uses Boolean logic to connect individual keywords or phrases within a single search query, using operators such as AND, OR and NOT, along with proximity operators such as WITHIN and NOT WITHIN.
Backup (full) – The process of copying data so that it can be restored if the original is lost or damaged. Often the backup is stored on removable media or away from the current data source. A “full” backup is a complete copy of the original data.
Backup (incremental) – An incremental backup is a partial backup, taken between full backups, that captures only the data that has changed since the previous backup. It must be used in conjunction with a full backup.
Backup tape – Removable media used for backing up files. The tapes may be stored in a safe place in case of a disaster.
Bit – A bit is the basic unit of information in computing and digital communications. A bit can have only one of two values, and may therefore be physically implemented with a two-state device. These values are most commonly represented as 0 and 1.
Bit-by-bit copy – The process of copying everything on a digital media source down to the smallest divisible part. This is the foundation for creating a Forensic Image.
Byte – The byte is a unit of digital information in computing and telecommunications that most commonly consists of eight bits.
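As a rough illustration of the Bates Number entry above, the sketch below assigns one sequential, zero-padded number to each page. The function name `bates_numbers`, the `ABC` prefix and the six-digit width are assumptions chosen for the example, not an industry requirement.

```python
def bates_numbers(prefix: str, start: int, page_count: int, width: int = 6) -> list[str]:
    """Return one sequential, zero-padded Bates number per page."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + page_count)]


# A three-page document followed by a two-page document.
first = bates_numbers("ABC", 1, 3)
second = bates_numbers("ABC", 1 + len(first), 2)

print(first)   # ['ABC000001', 'ABC000002', 'ABC000003']
print(second)  # ['ABC000004', 'ABC000005']
```

The second document starts where the first one ended, so no number is ever reused across the collection.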

C

Categorization – This term has several meanings, but is often used to describe the process of automatically grouping documents that have similar concepts using conceptual analytics (Latent Semantic Index).
Child Document – This term is used to describe a file that is attached to another file. It is usually applied to email attachments: the attachments are the Child or Children of the Parent email.
Clustering – This term is used to describe the process of automatically grouping documents that have similar concepts using conceptual analytics (Latent Semantic Index).
Coding – This is a term that describes adding information to a document record. Sometimes the added information is subjective and other times the information is objective.
Conceptual Analytics – Term used to describe several tools that use a Latent Semantic Index to group conceptually similar documents together.
Certified Forensic Computer Examiner (CFCE) – The CFCE credential was the first certification demonstrating competency in computer forensics in relation to Windows-based computers. The CFCE training and certification is conducted by the International Association of Computer Investigative Specialists (IACIS), a non-profit, all-volunteer organization of current and former law enforcement members.
Corrupt Data or Corrupt File – This refers to a file or group of files that have been damaged and no longer open as intended.
Chain of Custody – The documented record of who had control of a piece of evidence from the time it was collected to the time it is presented to the Court. For evidence to be admissible, you must be able to show this record without gaps.
Custodian – This refers to a person who is the subject of a litigation hold.

D

Data Extraction – The process of parsing data from electronic documents to identify their metadata and body.
De-duplication – The process of identifying exact duplicate documents and removing them from the processing and/or review process.
De-NIST – The process of identifying and removing files that are not user-created by matching them (using HASH values) against a list of files that are known to be created by software companies.
Discovery – Part of the legal process that allows all the parties in a proceeding to find out more about the case.
Document Family – This refers to the relationship between emails and their attachments or other paper documents and their attachments.
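The De-duplication entry above can be sketched with hash values: two documents with the same hash are treated as exact duplicates, and only the first one seen is kept. This is a minimal sketch, assuming SHA-256 and in-memory documents; the function name `deduplicate` and the sample file names are hypothetical, and real processing tools may hash selected metadata fields rather than raw bytes.

```python
import hashlib


def deduplicate(documents: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep the first document seen for each hash; report the rest as duplicates."""
    seen: dict[str, str] = {}        # hash value -> name of the retained document
    unique: dict[str, bytes] = {}    # documents that move on to review
    duplicates: dict[str, str] = {}  # duplicate name -> name of the retained document
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates[name] = seen[digest]
        else:
            seen[digest] = name
            unique[name] = content
    return unique, duplicates


docs = {
    "memo.txt": b"Quarterly results",
    "memo_copy.txt": b"Quarterly results",
    "notes.txt": b"Meeting notes",
}
unique, duplicates = deduplicate(docs)
print(sorted(unique))  # ['memo.txt', 'notes.txt']
print(duplicates)      # {'memo_copy.txt': 'memo.txt'}
```

De-NIST filtering follows the same pattern, except that each hash is compared against a list of known software files instead of against the other documents in the collection.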

E

Early Case Assessment (ECA) – Software designed to quickly determine the type of documents in a collection and to verify the proposed filtering parameters. Sometimes this is also known as Early Data Assessment (EDA).
Early Data Assessment (EDA) – Software designed to quickly determine the type of documents in a collection and to verify the proposed filtering parameters. Sometimes this is also known as Early Case Assessment (ECA).
EDRM Model – Acronym for Electronic Discovery Reference Model. It is a model workflow for the handling of electronic data in the legal Discovery process.
Electronic discovery – This is a term used to describe the process of preserving, collecting, processing, reviewing and sometimes producing documents in the legal phase of Discovery.
Electronic evidence – Any evidence that is stored in an electronic format.
Electronically Stored Information (ESI) – This is a phrase (or ESI for short) that was first used in the amended Federal Rules of Civil Procedure in 2006. Its purpose is to broaden the rules of discovery to include any and all forms of electronic data.
Email Threading – This is a process that uses pattern recognition to identify emails that are part of the same conversation. The process may also identify the last email, the one at the end of the conversation that contains all the parts of the conversation.
EnCase – One of the most recognized and court-accepted software platforms for preserving electronic evidence. EnCase is capable of creating forensic (bit-by-bit) images in the L01, Lx01, E01, and Ex01 formats.
Encryption – Encryption is the process of encoding messages or information in such a way that only authorized parties can read it. Encryption does not of itself prevent interception, but denies the message content to the interceptor.

F

F1 – The F1 score (also F-score or F-measure) is a measure of a test’s accuracy. It considers both the precision p and the recall r of the test to compute the score, which is their harmonic mean: F1 = 2pr / (p + r).
Federal Rules of Civil Procedure (FRCP) – The rules that govern civil proceedings in United States federal courts, including electronic discovery.
File Compression – A technology for storing data in fewer bits. It makes data smaller so less disk space is used. Common compressed formats include .zip and .rar.
Filtering – The process of culling down a population of documents in order to make it more manageable. This often involves the use of keyword, date and file type filters.
File conversion – The process of taking an application file designed to work with a particular application and modifying it to work with another application.
Forensic collection – The handling of electronic data, including collection, examination and analysis, in a manner that ensures its authenticity, so as to provide for its admission as evidence in a court of law.
Forensic Image – This is a complete bit-by-bit copy of a source drive that is created using software specifically designed for that purpose.
FTK – One of the most recognized and court-accepted software platforms for preserving electronic evidence. FTK is capable of creating forensic (bit-by-bit) images in the Raw (dd), SMART, E01 and AFF formats.
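To make the F1 entry above concrete, the sketch below computes the score as the harmonic mean of precision and recall. The function name `f1_score` and the sample values are assumptions for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(f1_score(0.5, 0.5))            # 0.5
print(round(f1_score(0.8, 0.6), 4))  # 0.6857
```

Because it is a harmonic mean, the score is pulled toward the lower of the two inputs, so a search cannot score well by maximizing precision alone or recall alone.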

G

Gigabyte – Approximately one billion bytes.

H

Hibernate – This refers to the process of taking a database or population of documents in a hosted platform and moving it to a semi-active or near-line state. Usually this process is done to save hosting and software licensing costs while still allowing quick access to the data if required.
Hashing – The use of an algorithm to generate a fixed-length value from the contents of a document. The value is, for practical purposes, unique to that content, and is referred to as a digital fingerprint of the document. It is used to authenticate documents and to identify duplicates.

I

Image Processing – Refers to a format of structuring data for the purpose of loading the data into a document review platform that includes rendering an image of each document.
Index – A list of words in a database that is used by software to provide fast access to searched information.
Ingestion – This usually describes the process of loading data to a review or ECA platform.
Image mounting – This is the process of connecting a forensic image to a computer system with specialized software that allows the user to view the contents of the image as if the image was a live drive.
Image restore – This is the process of taking a forensic image and placing the contents of the image on a live computer so that the original files are recognized by the operating system.

J

JPG – Standard computer file format for storing graphic images in a compressed form for general use. JPEG images are compressed using a mathematical algorithm. A variety of encoding processes can be used, depending on whether the user’s goal is the highest quality of image (lossless) or smallest file size (lossy). The JPEG and GIF formats are the most commonly used graphics formats on the Internet for lossy and lossless data compression, respectively.

K

L

Latent Semantic Index (LSI) – An indexing and retrieval method that uses a mathematical technique called singular value decomposition (SVD) to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI is based on the principle that words that are used in the same contexts tend to have similar meanings.
Legal Hold – A legal hold is a process that an organization uses to preserve all forms of relevant information when litigation is reasonably anticipated.
Load File – A file or group of files in a structured format designed to allow data to import into a document review platform.

M

Mailbox – A container for electronic mail data (messages and attachments) that contains one or more messages.
Megabyte – Approximately one million bytes.
Metadata – Often referred to as data about data, it is the information that describes the characteristics of a file.
Migration – This usually describes the process of moving data from one review or ECA platform to another or from any source to any target.

N

Native Format – This describes an application file that is in the same format it was originally in when created.
Native Processing – Refers to a format of structuring data for the purpose of loading the data into a document review platform that does not include rendering an image of each document.
Near-duplicate – This describes a file that is very similar in the actual text but not identical.
NIST – The acronym for the National Institute of Standards and Technology. The group keeps a list of files that are created by various software manufacturers. The list is used to help identify files on a computer system that are not user-created and are therefore not usually important to any legal process.

O

Optical Character Recognition (OCR) – A software process that is run on pictures or images of text files to attempt to recognize the characters of the text so that the text may be indexed and searched.

P

Parent Document – This term is used to describe a file that has other files attached. It is usually applied to an email with attachments: the email is the Parent, and the attachments are its Child or Children.
Precision – In search results analysis, precision is a measure of the accuracy of the results. It is the fraction of retrieved instances that are relevant.
Predictive Coding – A process to automate the categorization of documents into two categories, responsive and non-responsive. The process involves a human making review calls as examples for the software to use to determine the responsiveness of other documents based on conceptual similarity. This is also known as Assisted Review or Technology Assisted Review (TAR).
Processing – The process of taking unstructured data and structuring it for loading into a document review platform. This process typically involves extracting metadata and full text as well as filtering the documents.
Production – This is a term used to describe preparing and delivering documents to another party in response to a discovery request or subpoena.

Q

R

Recall – In search results analysis, recall is a measure of the completeness of the results. It is the fraction of relevant instances that are retrieved.
Redact – The process of hiding or removing parts of a document with black or white blocks. This process is used to hide privileged information.
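The Recall entry above, together with the Precision entry, can be illustrated with two sets of document identifiers: what a search retrieved and what is actually relevant. This is a minimal sketch; the function name `precision_recall` and the document IDs are made up for the example.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one search result set."""
    hits = len(retrieved & relevant)  # relevant documents that the search found
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = {"DOC1", "DOC2", "DOC3", "DOC4"}  # what the search returned
relevant = {"DOC2", "DOC4", "DOC5"}           # what a reviewer marked as relevant

p, r = precision_recall(retrieved, relevant)
print(p)            # 0.5    (2 of the 4 retrieved documents are relevant)
print(round(r, 4))  # 0.6667 (2 of the 3 relevant documents were retrieved)
```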

S

Sampling – The use of a subset of a document population to test characteristics of the entire population.
Seed/Training Set – A subset of documents used to teach a learning algorithm in conceptual analytics.
Structural Analytics – This is a term to describe a group of applications that help make a document review more efficient. The applications use the documents’ text.
Synchronization – Term used to describe the process of moving files from a source drive to a target drive. This process is typically performed with specialized software that manages the process to make sure every file is successfully copied from source to target.
Spoliation – The destruction or alteration of data that might be relevant to a legal matter.
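As an illustration of the Sampling entry above, the sketch below draws a simple random sample of document identifiers. It assumes simple random sampling without replacement and a fixed seed so the draw can be repeated; the function name `draw_sample` and the ID format are hypothetical.

```python
import random


def draw_sample(document_ids: list[str], sample_size: int, seed: int = 42) -> list[str]:
    """Draw a reproducible random sample without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the same sample reproducible later
    return rng.sample(document_ids, sample_size)


population = [f"DOC{n:05d}" for n in range(1, 1001)]  # 1,000 document IDs
sample = draw_sample(population, 50)

print(len(sample))       # 50
print(len(set(sample)))  # 50, so no document was drawn twice
```

Reviewing the sampled documents then gives an estimate of a characteristic, such as the share of responsive documents, for the whole population.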

T

Technology Assisted Review (TAR) – A process to automate the categorization of documents into two categories, responsive and non-responsive. The process involves a human making review calls as examples for the software to use to determine the responsiveness of other documents based on conceptual similarity. This is also known as Predictive Coding.
Terabyte – Approximately one trillion bytes.
TIFF – An acronym for Tagged Image File Format. A TIFF is an image file format that is commonly used to produce documents in a static form with a branded identification number on the bottom right.

U

Unicode – A computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world’s writing systems.
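Unicode/UTF – Unicode Transformation Formats (UTF-8, UTF-16 and UTF-32) are the text encodings that store Unicode characters as bytes. The Unicode code space fits in 21 bits and is designed to include every character in every language.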
Unitization – The process of dividing pages of documents into logical document units, including identifying parent and child documents.
Unallocated space – The space created on a hard drive when a file is marked for deletion. This space is no longer allocated to a specific file. Until it is overwritten, it still contains the previous data and can be retrieved.
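The difference between ASCII and the Unicode/UTF entries above shows up as soon as text contains a character outside the 128 ASCII characters. The sketch below is illustrative only; the helper names `encoded_sizes` and `is_ascii` are assumptions, while the encoding names are standard Python codecs.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes the text occupies in several UTF encodings."""
    return {name: len(text.encode(name)) for name in ("utf-8", "utf-16-le", "utf-32-le")}


def is_ascii(text: str) -> bool:
    """True when every character fits in the 128-character ASCII set."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


word = "caf\u00e9"  # "café", written with an escape for the accented letter

print(len(word))                         # 4
print(encoded_sizes(word))               # {'utf-8': 5, 'utf-16-le': 8, 'utf-32-le': 16}
print(is_ascii("cafe"), is_ascii(word))  # True False
```

The accented letter takes two bytes in UTF-8, which is why the same four characters occupy five bytes there.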

V

Verification – Process of confirming that a forensic image is an exact copy of the original by taking a HASH value of each.
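The Verification entry above amounts to hashing both the original and the copy and comparing the two values. The sketch below is a simplified illustration that assumes a raw (dd-style) image stored as an ordinary file and uses SHA-256; the names `file_hash` and `verify_image` are hypothetical, and forensic tools perform this check with their own built-in routines.

```python
import hashlib
import os
import tempfile


def file_hash(path: str, algorithm: str = "sha256", chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large images do not have to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_image(source_path: str, image_path: str) -> bool:
    """True when the source and the image produce the same hash value."""
    return file_hash(source_path) == file_hash(image_path)


# Demonstration with small stand-in files instead of real drives.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "source.bin")
    good_copy = os.path.join(folder, "image_good.bin")
    bad_copy = os.path.join(folder, "image_bad.bin")
    for path, data in [(source, b"evidence"), (good_copy, b"evidence"), (bad_copy, b"evidencf")]:
        with open(path, "wb") as handle:
            handle.write(data)
    matches = verify_image(source, good_copy)
    differs = verify_image(source, bad_copy)

print(matches)  # True
print(differs)  # False
```

A change to a single byte produces a different hash value, which is why a matching pair of values is accepted as confirmation that the copy is exact.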

W

Write blocker – Hardware or software designed to only allow data to transfer in one direction. It is used to protect a source drive from being accidentally written to while its data is being read or copied.

X

Y

Z
