Common terminology in Document Management


For definitions of specific functions within Nolij Web (such as ‘document viewer’, ‘folder objects panel’, ‘system objects panel’, ‘form viewer’, etc.), please refer to the help user guide within Nolij Web.


ADF 

Automatic Document Feeder. This is the means by which a scanner feeds the paper document.

 

Annotations 

The changes or additions made to a document using sticky notes, a highlighter, or other electronic tools. Document images or text can be highlighted in different colours, redacted (blacked-out or whited-out), stamped (e.g. “FAXED” or “CONFIDENTIAL”), or have electronic sticky notes attached. Annotations should be overlaid and not change the original document.


Archive 

To remove a document from view but retain it in the database. An archived document can still be retrieved.


Auto-Index

An OCR process to read a document and determine document type and folder ID.  


Auto-Load 

The ability to take a group of documents and place them into a batch folder or folders within Nolij. See also OCR and Auto-Index.


ASCII 

American Standard Code for Information Interchange. Used to define computer text that was built on a set of 128 alphanumeric, punctuation and control characters. ASCII has been a standard, non-proprietary text format since 1963.
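
As a small illustration, the sketch below converts characters to their numeric ASCII codes and back, and checks whether a string stays within the 7-bit ASCII range (codes 0 to 127). The sample strings and the helper name is_ascii are chosen only for this example.

```python
# Convert characters to their numeric codes and back again.
text = "Nolij"
codes = [ord(ch) for ch in text]            # character -> integer code
restored = "".join(chr(c) for c in codes)   # integer code -> character


def is_ascii(s: str) -> bool:
    """Return True if every character fits in the 7-bit ASCII range (0-127)."""
    return all(ord(ch) < 128 for ch in s)


print(codes)
print(restored)
print(is_ascii("colour"), is_ascii("café"))
```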

 

ASP 

Active Server Pages. A technology that simplifies customization and integration of Web applications. ASPs reside on a Web server and contain a mixture of HTML code and server-side scripts. An example of ASP usage includes having a server accept a request from a client, perform a query on a database, and then return the results of the query in HTML format for viewing by a web browser.

 

Bar Code 

A small pattern of lines that is read by a laser or an optical scanner, and which corresponds to a record in a database. An add-on component to imaging software, bar-code recognition is designed to increase the speed with which documents can be archived.

Batch Folder

A folder into which electronic documents may be scanned. Batches can be given custom names that describe the contents of the batch.


Batch Processing 

The name of the technique used to input a large amount of information in a single step, as opposed to individual processes.

 

Bitmap/Bitmapped 

See Raster/Rasterized.

 

BMP 

A native file format of Windows for storing images called “bitmaps.”


Boolean Logic 

The use of the terms “AND,” “OR” and “NOT” in conducting searches. Used to widen or narrow the scope of a search.
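
The effect of each operator can be sketched with sets of document IDs. The index below is invented for illustration (it is not how any particular product stores its data): each search term maps to the set of documents that contain it.

```python
# Hypothetical index: search term -> IDs of the documents containing it.
index = {
    "transcript": {1, 2, 4},
    "invoice": {2, 3},
    "confidential": {4, 5},
}
all_docs = {1, 2, 3, 4, 5}

# AND narrows the search: documents containing both terms.
transcript_and_invoice = index["transcript"] & index["invoice"]

# OR widens the search: documents containing either term.
transcript_or_invoice = index["transcript"] | index["invoice"]

# NOT excludes: every document that does not contain the term.
not_confidential = all_docs - index["confidential"]

print(transcript_and_invoice, transcript_or_invoice, not_confidential)
```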

 

Briefcase 

A method to simplify the transport of a group of documents from one computer to another.

 

Burn (CDs or DVDs) 

To record or write data on a CD or DVD.

Business System Integration 

The linking of university systems [such as PeopleSoft and other business systems] to ECM so that documents can be added or retrieved with little or no user input needed.  

  

Caching (of Images) 

The temporary storage of image files on a hard disk for later migration to permanent storage, like an optical or CD jukebox.

 

Capture Documents (Automated) 

A way to capture and store documents automatically. 

Examples of ways to accomplish this include:

-Use of a consistent identifier

-OCR text

-Barcode images

-Electronic forms and more


Capture Documents (Manual) 

A way to capture and store documents via Scanning or Drag and Drop. 

Examples of ways to accomplish this include:

-Scan directly from the browser into Nolij Web

-Drag and Drop files from Windows Explorer into Nolij Web


CD Publishing 

An alternative to photocopying large volumes of paper documents. This method involves coupling image and text documents with viewer software on CDs. Sometimes search software is included on the CDs to enhance search capabilities.


CD-R 

Short for CD-Recordable. This is a CD which can be written (or recorded) only once. It can be copied to distribute a large amount of data. CD-Rs can be read on any CD-ROM drive whether on a standalone computer or network system. This makes interchange between systems easier.

 

CD-ROM Drive 

A computer drive that reads compact discs.

 

Client-Server Architecture vs. File-Sharing 

Two common application software architectures found on computer networks. With file-sharing applications, all searches occur on the workstation, while the document database resides on the server. With client-server architecture, CPU-intensive processes (such as searching and indexing) are completed on the server, while image viewing and OCR occur on the client. File-sharing applications are easier to develop, but they tend to generate tremendous network data traffic in document imaging applications. They also expose the database to corruption through workstation interruptions. Client-server applications are more difficult to develop, but dramatically reduce network data traffic and insulate the database from workstation interruptions.

 

See also n-tier Architecture.

 

COLD 

Computer Output to Laser Disk. A process that outputs electronic records and printed reports to laser disk instead of a printer. Can be used to replace COM (Computer Output to Microfilm) or printed reports such as green-bar.

 

COM 

Computer Output to Microfilm. A process that outputs electronic records and computer generated reports to microfilm.

 

COM Object 

Component Object Model. COM refers to both a specification and implementation developed by Microsoft Corporation, which provides a framework for integrating components of a software application. COM allows developers to build software by assembling reusable components from different vendors.

 

Compression Ratio 

The ratio of the size of an uncompressed file to the size of its compressed version, e.g., with a 20:1 compression ratio, an uncompressed file of 1 MB is compressed to 50 KB.
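
The arithmetic in the example can be checked with a short sketch. It assumes 1 MB is taken as 1,000,000 bytes and 1 KB as 1,000 bytes; the function names are illustrative only.

```python
def compression_ratio(uncompressed_bytes: float, compressed_bytes: float) -> float:
    """Ratio of the original size to the compressed size (20.0 means 20:1)."""
    return uncompressed_bytes / compressed_bytes


def compressed_size(uncompressed_bytes: float, ratio: float) -> float:
    """Size in bytes after compressing at the given ratio."""
    return uncompressed_bytes / ratio


one_mb = 1_000_000  # assumption: decimal megabyte
size_after = compressed_size(one_mb, 20)
print(size_after / 1_000, "KB")
print(compression_ratio(one_mb, size_after))
```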

 

CPU 

Central Processing Unit. The “brain” of the computer.

 

De-shading 

Removing shaded areas to render images more easily recognizable by OCR. De-shading software typically searches for areas with a regular pattern of tiny dots.

 

De-skewing 

The process of straightening skewed (tilted or crooked) images. De-skewing is one of the image enhancements that can improve OCR accuracy. Documents can become skewed when they are scanned or faxed.

 

De-speckling 

Removing isolated speckles from an image file. Speckles can develop when a document is scanned or faxed.
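
One simple way to picture de-speckling is to clear any black pixel that has no black neighbours. The sketch below does this on a tiny black-and-white image held as rows of 0 (white) and 1 (black). It is an illustrative simplification, not the algorithm used by any specific scanner or imaging product, and the function name despeckle is made up for this example.

```python
def despeckle(image):
    """Return a copy of a 0/1 image with isolated black (1) pixels cleared."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            # Count black pixels among the (up to) eight surrounding pixels.
            neighbours = sum(
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            if neighbours == 0:
                result[r][c] = 0  # a lone dot is treated as a speckle
    return result


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # isolated speckle
    [0, 0, 0, 1, 1],  # part of a solid block, kept
    [0, 0, 0, 1, 1],
]
cleaned = despeckle(page)
for row in cleaned:
    print(row)
```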


Dithering 

The process of converting greys to different densities of black dots, usually for the purposes of printing or storing colour or greyscale images as black and white images.
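
A minimal sketch of one common technique, ordered dithering with a 2x2 threshold matrix, shows how a grey level becomes a density of black dots. The matrix, the 0 to 255 grey scale and the function name are assumptions made for this example; real imaging software may use other methods.

```python
# 2x2 Bayer matrix: each cell gets a different threshold, so darker greys
# turn on more dots within every 2x2 block.
BAYER_2X2 = [[0, 2], [3, 1]]


def ordered_dither(grey):
    """grey: rows of values from 0 (black) to 255 (white).

    Returns rows of 0/1, where 1 means a black dot is printed.
    """
    out = []
    for y, row in enumerate(grey):
        out_row = []
        for x, value in enumerate(row):
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value < threshold else 0)
        out.append(out_row)
    return out


mid_grey = [[128] * 4 for _ in range(4)]
dots = ordered_dither(mid_grey)
for row in dots:
    print(row)
```

With a uniform mid-grey input, half of the pixels become black dots; a lighter grey produces fewer dots and a darker grey produces more.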

  

Document 

Any type of electronic file, including scanned electronic documents.

 

Document Imaging 

Software used to store, manage, retrieve and distribute documents quickly and easily on the computer.

 

Drag-and-Drop 

The movement of on-screen objects by dragging them across the screen with the mouse.

 

Duplex Scanners vs. Double-Sided Scanning 

Duplex scanners automatically scan both sides of a double-sided page, producing two images at once. Double-sided scanning uses a single-sided scanner to scan double-sided pages, scanning one collated stack of paper, then flipping it over and scanning the other side.

 

DVD 

Digital Video Disc or Digital Versatile Disc. A plastic disc, like a CD, on which data can be written and read. DVDs are faster, can hold more information, and can support more data formats than CDs.

 

Electronic Document Management 

Imaging software that helps manage electronic documents.

 

EMF

Microsoft Windows Enhanced Metafile – an image type.

 

Erasable Optical Drive 

A type of optical drive that uses erasable optical discs.

 

Flatbed Scanner 

A flat-surface scanner that allows users to input books and other documents.

 

Folder Browser 

A system of on-screen folders (usually hierarchical or “stacked”) used to organize documents. For example, the Windows Explorer program in Microsoft Windows is a type of folder browser that displays the directories on your disk.

 

Forms Processing 

A specialized imaging application designed for handling pre-printed forms. Forms processing systems often use high-end (or multiple) OCR engines and elaborate data validation routines to extract hand-written or poor quality print from forms that go into a database. This type of imaging application faces major challenges, since many of the documents scanned were never designed for imaging or OCR.

 

Full-text Indexing and Search 

Enables the retrieval of documents by either their word or phrase content. Every word in the document is indexed into a master word list with pointers to the documents and pages where each occurrence of the word appears.
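
The master word list described above is often called an inverted index. The following is a minimal sketch under simple assumptions: documents are given as lists of page texts, and words are lower-cased runs of letters and digits. The document IDs and sample text are invented for the example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """documents: {doc_id: [page_text, ...]}.

    Returns a mapping of word -> list of (doc_id, page_number) pointers.
    """
    index = defaultdict(list)
    for doc_id, pages in documents.items():
        for page_no, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].append((doc_id, page_no))
    return index


docs = {
    "A": ["Invoice for tuition", "Payment received"],
    "B": ["Tuition refund request"],
}
index = build_index(docs)
print(index["tuition"])
print(index["payment"])
```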

 

Fuzzy Logic 

A full-text search procedure that looks for exact matches as well as similarities to the search criteria, in order to compensate for spelling or OCR errors.
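
As a rough illustration of similarity matching, the sketch below uses Python's standard difflib module to find indexed words that resemble a misread search term. This is a stand-in technique chosen for the example, not necessarily what an imaging product does internally; the vocabulary and the 0.8 cutoff are assumptions.

```python
from difflib import get_close_matches

# Hypothetical list of words that appear in the full-text index.
vocabulary = ["invoice", "transcript", "receipt"]

# "lnvoice" mimics a typical OCR error (lower-case L read instead of I).
matches = get_close_matches("lnvoice", vocabulary, n=1, cutoff=0.8)
print(matches)

# A term with nothing similar in the index returns no matches.
no_matches = get_close_matches("zzz", vocabulary, n=1, cutoff=0.8)
print(no_matches)
```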

 

GIF 

Graphics Interchange Format. CompuServe®’s native file format for storing images.


Gigabyte 

2^30 (approximately one billion) bytes, or 1024 megabytes. In terms of image storage capacity, one gigabyte equals approximately 17,000 8½” x 11” pages scanned at 300 dpi, stored as TIFF Group IV images.
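
The page estimate can be sanity-checked with some arithmetic. The sketch assumes a black and white scan at 1 bit per pixel and takes one gigabyte as 2 to the power of 30 bytes; it then works out the compression ratio implied by 17,000 pages per gigabyte. Actual file sizes vary with page content.

```python
GIGABYTE = 2 ** 30                      # 1,073,741,824 bytes
assert GIGABYTE == 1024 * 2 ** 20       # i.e. 1024 megabytes

# An 8.5 x 11 inch page at 300 dots per inch, 1 bit per pixel.
width_px = int(8.5 * 300)               # 2550
height_px = 11 * 300                    # 3300
uncompressed_bytes = width_px * height_px // 8

pages_per_gb = 17_000
bytes_per_page = GIGABYTE / pages_per_gb
implied_ratio = uncompressed_bytes / bytes_per_page

print(uncompressed_bytes, "bytes uncompressed per page")
print(round(bytes_per_page), "bytes per stored page")
print(round(implied_ratio, 1), ": 1 implied compression ratio")
```

The implied ratio comes out a little under the 20-to-1 figure usually quoted for Group IV compression, so the estimate of 17,000 pages is in the expected range.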

 

Greyscale 

See “Scale-to-Grey.”

 

Hierarchical Storage Management (HSM) 

Software that automatically migrates files from on-line to near-line storage media, usually on the basis of the age or frequency of use of the files.
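
A migration rule based on age can be sketched as follows. The one-year threshold, file names and function name are hypothetical; real HSM software applies its own configurable policies.

```python
from datetime import date, timedelta


def select_for_migration(files, today, max_age_days=365):
    """files: list of (name, last_accessed_date) pairs.

    Returns the names of files not accessed within max_age_days,
    i.e. candidates to move from on-line to near-line storage.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [name for name, last_accessed in files if last_accessed < cutoff]


files = [
    ("old_scan.tif", date(2021, 5, 1)),
    ("recent_scan.tif", date(2023, 12, 1)),
]
to_migrate = select_for_migration(files, today=date(2024, 1, 1))
print(to_migrate)
```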

 

ICR 

Intelligent Character Recognition. A software process that recognizes handwritten and printed text as alphanumeric characters.

 

Image Enabling 

Allows for fast, straightforward manipulation of an imaging application through third-party software. For example, image enabling allows for launching the imaging client interface, displaying search results in the client, and bringing up the scan dialogue box, all from within a third-party application.

 

Image Processing Card (IPC) 

A board mounted in the computer, scanner or printer that facilitates the acquisition and display of images. The primary function of most IPCs is the rapid compression and decompression of image files.

 

Index Fields 

Database fields used to categorize and organize documents. Often user-defined, these fields can be used for searches.

 

Internet Publishing 

Specialized imaging software that allows large volumes of paper documents to be published on the Internet or intranet. These files can be made available to other departments, offsite colleagues or the public for searching, viewing and printing.

 

IPX/SPX 

Communications protocol used by Novell networks.

 

ISIS and TWAIN Scanner Drivers 

Specialized applications used for communication between scanners and computers.

 

ISO 9660 CD Format 

The International Organization for Standardization (ISO) format for creating CD-ROMs that can be read worldwide.

 

JPEG 

Joint Photographic Experts Group (JPEG or JPG). An image compression format used for storing colour photographs and images.

 

Jukebox 

A mass storage device that holds optical disks and loads them into a drive.

 

Key Field 

Database fields used for document searches and retrieval. Synonymous with “index field.”

 

Magneto-Optical Drive 

A drive that combines laser and magnetic technology to create high-capacity erasable storage.

 

MAPI 

Messaging Application Programming Interface. This Windows software standard has become a popular e-mail interface and is used by MS Exchange, GroupWise, and other e-mail packages.

 

MFP 

Multifunction Printer or Multifunctional Peripheral. A device that performs any combination of scanning, printing, faxing, or copying.

 

Multi-page TIFF 

See TIFF.

 

Near-Line 

Documents stored on optical disks or compact disks that are housed in the jukebox or CD changer and can be retrieved without human intervention.

 

NetWare Loadable Module (NLM) 

An application that runs as part of the network operating system (NOS) of a Novell NetWare server.

 

NT 

New Technology. Refers to Microsoft Windows NT server and workstation software.

 

n-tier Architecture 

The term can apply to the physical or logical architecture of computing. The term refers to a method of distributed computing in which the processing of a specific application occurs over “n” number of machines across a network. Typical tiers include a data tier, business logic tier, and a presentation tier, wherein a given machine will perform the individualized tasks of a tier. Scalability is among the advantages of n-tier architecture.

 

OCR 

Optical Character Recognition (OCR). A software process that recognizes printed text as alphanumeric characters.

 

Off-Line 

Archival documents stored on optical disks or compact disks that are not connected or installed in the computer, but instead require human intervention to be accessed.

 

On-Line 

Documents stored on the hard drive or magnetic disk of a computer that are available immediately.

 

Optical Disks 

Computer media, similar to a compact disc, that an optical drive reads with a laser. Many optical disks cannot be rewritten; for erasable types, see Erasable Optical Drive and Magneto-Optical Drive.

 

Optical Jukebox

See “Jukebox.”


Phase Change 

A method of storing information on rewritable optical disks.

 

Pixel 

Picture Element. A single dot in an image. It can be black and white, greyscale or colour.


Portable Volumes 

A feature that facilitates the moving of large volumes of documents without requiring copying multiple files. Portable volumes enable individual CDs to be easily regrouped, detached and reattached to different databases for a broader information exchange.

 

RAID 

Redundant Array of Independent Disks. A collection of hard disks that act as a single unit. Files on RAID drives can be duplicated (“mirrored”) to preserve data. RAID systems vary in their level of redundancy: level 0 spreads data across disks with no redundancy, level 1 uses two disks that mirror each other, and so on up to level 5, the most common.

 

Raster/Rasterized (Raster or Bitmap Drawing) 

A method of representing an image with a grid (or “map”) of dots or pixels. Typical raster file formats are GIF, JPEG, TIFF, PCX, BMP, etc.

 

Redaction 

A type of document annotation that provides word-level security by concealing from view specific portions of sensitive documents. Like all annotations in a document imaging system, redactions should be image overlays that protect information but do not alter original document images.


 

Region (of an image) 

An area of an image file that is selected for specialized processing. Also called a “zone.”

 

Scale-to-Grey 

An option to display a black and white image file in an enhanced mode, making it easier to view. A scale-to-grey display uses grey shading to fill in gaps or jumps (known as aliasing) that occur when displaying an image file on a computer screen. Also known as greyscale.

 

Scalability 

The capacity of a system to expand without requiring major reconfiguration or re-entry of data. Multiple servers or additional storage can be easily added.

 

Scanner 

An input device commonly used to convert paper documents into computer images. Scanner devices are also available to scan microfilm and microfiche.

 

SCSI 

Small Computer System Interface. Pronounced “skuzzy.” A standard for attaching peripherals (notably mass storage devices and scanners) to computers. SCSI allows for up to 7 devices to be attached in a chain via cables. The current SCSI standard is “SCSI II,” also known as “Fast SCSI.”

 

SCSI Scanner Interface 

The device used to connect a scanner with a computer.

 

Single-Page TIFF 

See TIFF.

 

SQL 

Structured Query Language. The popular standard for running database searches (queries) and reports.
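
A short example of a query, run here against an in-memory SQLite database from Python. The table name, column names and sample rows are invented for illustration and do not reflect the actual Nolij database schema.

```python
import sqlite3

# Build a throwaway in-memory database with a hypothetical documents table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, doc_type TEXT, folder_id TEXT)"
)
conn.executemany(
    "INSERT INTO documents (doc_type, folder_id) VALUES (?, ?)",
    [("Transcript", "F100"), ("Invoice", "F200"), ("Application", "F100")],
)

# The query: find every document filed under folder F100.
rows = conn.execute(
    "SELECT id, doc_type FROM documents WHERE folder_id = ? ORDER BY id",
    ("F100",),
).fetchall()
conn.close()

print(rows)
```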

 

TCP/IP 

Transmission Control Protocol/Internet Protocol. A network communications protocol suite. This is the protocol used by the Internet.

 

Templates 

Sets of index fields for documents.

 

Thumbnails 

Small versions of an image used for quick overviews or to get a general idea of what an image looks like.

 

TIFF 

Tagged Image File Format. A non-proprietary raster image format, in wide use since 1986, which allows for several different types of compression. TIFFs may be either single or multipage files. A single-page TIFF is a single image of one page of a document. A multi-page TIFF is a large single file consisting of multiple document pages. Document imaging systems that store documents as single-page TIFFs offer significant network performance benefits over multi-page TIFF systems.

 

TIFF Group III (compression) 

A one-dimensional compression format for storing black and white images that is utilized by most fax machines.
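
One-dimensional compression works on each scan line separately, describing it as alternating runs of white and black pixels rather than storing every pixel. The sketch below shows only this run-length step for a single line; the real Group III scheme goes further and replaces each run length with a short code, which is omitted here. The function name and sample line are made up for the example.

```python
def run_lengths(scan_line):
    """Return alternating white/black run lengths for one row of pixels.

    Pixels are 0 (white) or 1 (black). The first run is always white,
    so a line that starts with black begins with a white run of length 0.
    """
    runs = []
    current, count = 0, 0
    for pixel in scan_line:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current, count = pixel, 1
    runs.append(count)
    return runs


line = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
print(run_lengths(line))
```

Typical business documents contain long white runs, which is why this approach shrinks them so effectively.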

 

TIFF Group IV (compression) 

A two-dimensional compression format for storing black and white images. Typically compresses at a 20-to-1 ratio for standard business documents.

 

Video Scanner Interface 

A type of device used to connect scanners with computers. Scanners with this interface require a scanner control board designed by Kofax, Xionics or Dunord.

 

Workflow, Ad Hoc 

A simple manual process by which documents can be moved around a multi-user imaging system on an “as-needed” basis.

 

Workflow, Rules-Based 

A programmed series of automated steps that routes documents to various users on a multiuser imaging system.
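
A rules-based route can be pictured as an ordered list of conditions, each paired with a destination; the first rule that matches decides where the document goes. The document fields, thresholds and inbox names below are hypothetical and serve only to illustrate the idea.

```python
def route(document, rules, default="General Inbox"):
    """Return the destination of the first rule whose condition matches."""
    for condition, destination in rules:
        if condition(document):
            return destination
    return default


# Rules are checked in order, so the more specific rule comes first.
rules = [
    (lambda d: d.get("doc_type") == "Invoice" and d.get("amount", 0) > 5000,
     "Finance Manager"),
    (lambda d: d.get("doc_type") == "Invoice", "Accounts Payable"),
    (lambda d: d.get("doc_type") == "Transcript", "Registrar"),
]

print(route({"doc_type": "Invoice", "amount": 12000}, rules))
print(route({"doc_type": "Invoice", "amount": 300}, rules))
print(route({"doc_type": "Transcript"}, rules))
print(route({"doc_type": "Letter"}, rules))
```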

 

WORM Disks 

Write-Once-Read-Many Disks. A popular archival storage medium during the 1980s. Acknowledged as the first optical disks, they are primarily used to store archives of data that cannot be altered. WORM disks are created by standalone PCs and cannot be used on the network, unlike CD-Rs.

 

ZIP 

A common file compression format that allows quick and easy storage for transport.

 

Zone OCR 

An add-on feature of the imaging software that populates document templates by reading certain regions or zones of a document, and then placing the text into a document index field.



Need more help? Contact the ITS Service Desk.