Commonly Used Imaging & ECM Terms
ADF (Automatic Document Feeder)
The feature on the scanner that feeds a stack of pages automatically, so a multi-page document can be scanned in one pass. The larger the ADF's capacity, the more pages the scanner can scan without stopping to be reloaded.
Annotations
The changes or additions made to a document using sticky notes, a highlighter, or other electronic tools. Document images or text can be highlighted in different colors, redacted (blacked-out or whited-out), stamped (e.g. “FAXED” or “CONFIDENTIAL”), or have electronic sticky notes attached. Annotations should be overlaid and not change the original document.
Backfile Conversion
Process of converting files/documents that have accumulated over a period of time. Used in reference to projects to microfilm or scan and digitize documents.
Back-up
A copy of data kept in short-term storage as an assurance against loss of master data. Also, the process of producing a back-up copy.
BPM (Business Process Management)
Bar Code
A small pattern of vertical lines that is read by a laser or an optical scanner, and which corresponds to a record in a database. An add-on component to imaging software, this feature is designed to increase the speed with which documents can be archived.
Batch Processing
The name of the technique used to input a large amount of information in a single step, as opposed to individual processes.
Client
The user interface of an application that allows users to perform functions like scanning, retrieving, viewing and routing documents. A client can be a “thick” client (a Windows application) or a “thin” client (a website).
Client/Server Architecture
A networked computer architecture where numerous clients are connected to one or more server computers. In a DM system, PC or workstation clients are used for viewing, editing, image processing, etc. Servers hold the index database and manage the image files.
CMS (Content Management System)
Commonly refers to a web content management system or to ECM (Enterprise Content Management).
Content Management
Term used to refer to systems that manage the content objects which form documents. Can be used to differentiate compound document management systems from simple document management systems. Increasingly being used as an alternative, technically more accurate, term for an electronic document management system.
Deskew
The process of straightening skewed (crooked or slightly rotated) images. Deskewing is one of the image enhancements that can improve OCR accuracy. Documents often become skewed when they are scanned or faxed.
Despeckle
Removing isolated speckles from an image file. Speckles often develop when a document is scanned or faxed.
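As a minimal sketch of the idea, assuming a bitonal image stored as a list of rows in which 1 is a black pixel and 0 is white, a speckle can be treated as a black pixel with no black neighbours. The function name `despeckle` and the sample data are illustrative only; real imaging software typically uses more tolerant filters.

```python
def despeckle(image):
    """Turn black pixels (1) that have no black neighbours into white (0)."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]  # work on a copy, leave the input intact
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the (up to) eight pixels surrounding (y, x).
            has_black_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbour:
                cleaned[y][x] = 0
    return cleaned


scan = [
    [1, 0, 0, 0, 0],  # lone pixel in the corner: a speckle
    [0, 0, 0, 1, 1],  # connected pixels: part of a character stroke
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
cleaned = despeckle(scan)
```

The isolated corner pixel is cleared, while the three connected pixels are kept.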
Document Imaging
Software used to store, manage, retrieve and distribute documents quickly and easily on the computer.
Document Management
Document management technology helps organizations better manage the creation, revision, approval, and consumption of electronic documents. It provides key features such as library services, document profiling, searching, check-in, check-out, version control, revision history, and document security.
DoD 5015
A US Department of Defense design criteria standard (DoD 5015.2-STD) for electronic records management software applications. It comprises a two-part standard and a compliance testing regime for records management systems used in US government.
Duplex
Duplex scanners automatically scan both sides of a double-sided page, producing two images at once. Double-sided scanning uses a single-sided scanner to scan double-sided pages, scanning one collated stack of paper, then flipping it over and scanning the other side.
ECM (Enterprise Content Management)
The strategies, methods and tools used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and strategies allow the management of an organization's unstructured information, wherever that information exists.
Electronic Document Management System
A system that manages the content of electronic documents and provides facilities for version control and access control. Also referred to as a Document Management System or an Enterprise Content Management system.
Flatbed Scanner
A flat-surface scanner that allows users to input books and other documents.
Forms Processing
A specialized imaging application designed for handling pre-printed forms. Forms processing systems often use high-end (or multiple) OCR engines and elaborate data validation routines to extract hand-written or poor quality print from forms that go into a database. This type of imaging application faces major challenges, since many of the documents scanned were never designed for imaging or OCR.
Full-text Indexing and Search
Enables the retrieval of documents by either their word or phrase content. Every word in the document is indexed into a master word list with pointers to the documents and pages where each occurrence of the word appears.
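The master word list described above is often called an inverted index. The following is a minimal sketch, assuming each document is already available as a list of page texts; the names `build_index` and `search` and the sample documents are made up for illustration. It matches pages containing every query word, which is simpler than true phrase search.

```python
from collections import defaultdict
import re


def build_index(documents):
    """Map each word to the set of (document name, page number) pairs containing it.

    `documents` is a dict of {name: [page_text, ...]}; pages are numbered from 1.
    """
    index = defaultdict(set)
    for name, pages in documents.items():
        for page_no, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((name, page_no))
    return index


def search(index, query):
    """Return the (document, page) pairs that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


docs = {
    "invoice-001": ["Invoice for scanning services", "Payment due in 30 days"],
    "contract-007": ["Service contract", "Payment terms and scanning volume"],
}
index = build_index(docs)
```

With this sample data, `search(index, "payment scanning")` returns only page 2 of `contract-007`, the one page where both words occur.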
ICR
Intelligent Character Recognition. A software process that recognizes handwritten and printed text as alphanumeric characters.
Image Enabling
Allows for fast, straightforward manipulation of a client through third-party applications. In examples with Laserfiche, image enabling allows for launching the Laserfiche client, displaying search results in the client, and bringing up the scan dialogue box, all from within a third party application.
Image processing
The manipulation of digital images after they have been scanned and digitized. Includes rotation, zoom, enhancement, analysis, etc.
Imaging
The process of capturing, storing and retrieving information, regardless of its original format, using micrographics and/or scanning and optical disk technologies.
Index Fields
Database fields used to categorize and organize documents. Often user-defined, these fields can be used for searches.
Indexing
An essential part of the capture process that creates metadata from scanned documents (a customer ID number, for example) so the document can be found. Indexing can be based on keywords or full text.
Intranet
Essentially a private Internet. It makes use of the same technology as the Internet but is used to establish a network that is private to a company or organization. It resides behind a ‘firewall’ and cannot be accessed by people outside the organization.
MFP (Multifunction Printer or Multifunctional Peripheral)
A device that performs any combination of scanning, printing, faxing, or copying.
NAS (Network Attached Storage)
A disk array storage system that is set up with its own TCP/IP network address rather than being attached to the department computer that is serving applications to a network’s workstation users. By removing storage access and its management from the department server, both application programming and files can be served faster because they are not competing for the same processor resources.
OCR (Optical Character Recognition)
A software process that recognizes printed text as alphanumeric characters.
ODBC (Open Database Connectivity)
A standard programming interface for connecting client workstations to server databases.
OMR (Optical Mark Recognition)
A recognition technology for detecting the presence or absence of marks in a defined space, e.g. ticks or crosses in boxes, etc.
Portal
Literally, a gateway. Used to mean intelligent browser software that lets users personalize their search engine and define websites, document libraries and subject interest profiles, so that they are alerted when new documents matching those profiles are added to those sites or libraries. Provides unified access to internal document repositories and third-party websites. Can be divided into personal, workgroup, corporate and enterprise portals.
PPM (Pages Per Minute)
Commonly used measure for the output speed of printers and input speed of scanners.
RAID
Redundant Array of Independent Disks. A collection of hard disks that act as a single unit. Files on RAID drives can be duplicated (“mirrored”) to preserve data. RAID levels differ in how they provide redundancy: level 0 stripes data across disks with no redundancy, level 1 mirrors two disks, and level 5, one of the most common, stripes data across three or more disks with parity information used for recovery.
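Besides mirroring, some RAID levels (level 5 among them) protect data with parity: an extra block computed by XOR-ing the data blocks together, from which any single lost block can be rebuilt. The sketch below illustrates only that arithmetic, assuming three equal-sized data blocks; the helper name `xor_blocks` and the sample bytes are made up, and a real array also distributes blocks and parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_blocks = [b"SCAN", b"FILE", b"DATA"]  # one block per data disk
parity = xor_blocks(data_blocks)           # parity block kept on a further disk

# Simulate losing the second disk and rebuilding it from the survivors plus parity.
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
```

Because XOR-ing a value with itself cancels out, combining the surviving blocks with the parity block leaves exactly the missing block.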
Recognition
Technologies that allow paper information to be translated to electronic data without manual data input. Recognition technologies have progressive capabilities, from optical character recognition (OCR) to intelligent character recognition (ICR), and are important for converting large volumes of forms or unstructured data into usable information in a content management system.
Record
Any piece of information created or received and maintained by an organization or person in the course of their business or conduct of affairs and kept as evidence of such activity.
Records Management
The function of managing records to meet organizational needs, business efficiency, and legal and financial accountability. Content of long-term business value is deemed a record and is managed according to a retention schedule, which determines how long a record is kept based on either outside regulations or internal business practices. Any piece of content can be designated a record.
Redaction
A type of document annotation that provides word-level security by concealing from view specific portions of sensitive documents. Like all annotations in a document imaging system, redactions should be image overlays that protect information but do not alter original document images.
RIM (Records and Information Management)
A term typically used by corporations that represents the administration of all business records throughout their life cycle. It represents the management of contracts, memos, paper and electronic files, marketing materials, reports, emails and instant message logs, website content, database records and other documents across the entire organization.
Retention Schedule
A schedule that details the categories of records an organization is required to store. It outlines the length of time different categories of records should be stored, and when they can be deleted.
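A retention schedule can be pictured as a lookup from record category to retention period. The sketch below assumes periods counted in whole years from the creation date; the categories, periods, and function names are hypothetical and not drawn from any regulation.

```python
from datetime import date

# Hypothetical schedule: record category -> years to retain.
RETENTION_YEARS = {"invoice": 7, "job_application": 2, "meeting_notes": 1}


def disposal_date(category, created):
    """Return the earliest date a record in this category may be deleted."""
    years = RETENTION_YEARS[category]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 1 March.
        return date(created.year + years, 3, 1)


def can_delete(category, created, today):
    """True once the retention period for the record has fully elapsed."""
    return today >= disposal_date(category, created)
```

For example, `can_delete("meeting_notes", date(2023, 5, 1), date(2024, 4, 30))` is false, and becomes true one day later.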
SaaS (Software as a Service)
Often described as cloud software, SaaS is the delivery of software over the Internet by external suppliers, typically on a subscription or pay-per-use model.
Scalability
The capacity of a system to expand without requiring major reconfiguration or re-entry of data. Multiple servers or additional storage can be easily added.
Scanner
An input device commonly used to convert paper documents into computer images. Special scanners are available to capture large format documents, typically up to A0 size, transparent originals such as microforms, and bound material such as books.
SCSI
Pronounced “skuzzy.” A standard for attaching peripherals (notably mass storage devices and scanners) to computers. SCSI allows for up to 7 devices to be attached in a chain via cables. SCSI interfaces provide for faster data transmission rates than standard serial and parallel ports.
Server
A computer dedicated to serving other computers. Common server applications are the central storage of files which are in shared usage and control of system peripherals such as printers which are shared by a group of users.
Service Bureau
A company specializing in the provision of micrographic or electronic imaging services under contract.
SSL (Secure Sockets Layer)
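A protocol for encrypting data, such as private documents, sent over the Internet. SSL has been superseded by TLS (Transport Layer Security), although the older name remains in common use.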
Thumbnails
Small versions of an image used for quick overviews or to get a general idea of what an image looks like.
TIFF
Tagged Image File Format. A non-proprietary raster image format, introduced in 1986 and widely used since, which allows for several different types of compression. TIFFs may be either single or multi-page files. A single-page TIFF is a single image of one page of a document. A multi-page TIFF is a large single file consisting of multiple document pages.
Thresholding
Image processing technique which defines whether a scanned pixel should be considered black or white. Commonly used to drop out background colors in order to clarify the textual content or line work of a document.
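A minimal sketch of fixed thresholding, assuming a grayscale image stored as rows of values from 0 (black) to 255 (white). The cutoff of 128 and the name `threshold` are arbitrary choices for illustration; scanners and imaging software often pick the cutoff adaptively.

```python
def threshold(gray, cutoff=128):
    """Return a bitonal image: 0 (black) where a pixel is darker than cutoff, else 255 (white)."""
    return [[0 if pixel < cutoff else 255 for pixel in row] for row in gray]


page = [
    [250, 245, 30, 240],
    [200, 20, 25, 210],   # 200 and 210 mimic a light coloured background
    [248, 35, 190, 252],
]
bitonal = threshold(page)
```

The light background values are pushed to white and drop out, while the dark values that stand in for text become solid black.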
TWAIN
TWAIN is a widely used software standard (a programming interface and driver model) that lets you scan an image directly into the application (such as Photoshop) where you want to work with the image. Without TWAIN, you would have to close the application that was open, open a special application to receive the image, and then move the image to the application where you wanted to work with it. The TWAIN driver runs between an application and the scanner hardware. It usually comes as part of the software package you get when you buy a scanner, and support for it is built into Photoshop and similar image manipulation programs.
Workflow
A process by which documents can be moved around a multi-user imaging system on an “as-needed” basis. A programmed series of automated steps that route documents to various users on a multi-user imaging system.
Zonal OCR
An add-on feature of the imaging software that populates document templates by reading certain regions or zones of a document, and then placing the text into a document index field.
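The zone-to-field mapping can be sketched as follows, assuming an OCR engine has already produced each recognized word together with the position of its top-left corner. The word list, the zone coordinates, and the name `extract_fields` are all hypothetical; actual products define templates through their own tools.

```python
# Hypothetical OCR output: (text, x, y) for the top-left corner of each word.
ocr_words = [
    ("INVOICE", 40, 20), ("10482", 420, 20),
    ("Acme", 40, 80), ("Supplies", 95, 80),
    ("Total", 300, 600), ("1,250.00", 420, 600),
]

# Hypothetical template: index field -> zone as (left, top, right, bottom).
zones = {
    "invoice_number": (400, 0, 600, 50),
    "vendor": (0, 60, 300, 110),
    "total": (400, 580, 600, 630),
}


def extract_fields(words, zones):
    """Join the words whose position falls inside each zone, in reading order."""
    fields = {}
    for field, (left, top, right, bottom) in zones.items():
        inside = [(y, x, text) for text, x, y in words
                  if left <= x < right and top <= y < bottom]
        # Sorting by (y, x) gives top-to-bottom, left-to-right order.
        fields[field] = " ".join(text for _, _, text in sorted(inside))
    return fields


fields = extract_fields(ocr_words, zones)
```

Each index field receives only the text found inside its zone, so the label words "INVOICE" and "Total" are left out.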
Zoom
To enlarge a portion of an image to view it more clearly.