Unified and efficient search across all your data and documents

ColumbusDoc is
compatible
fast
reliable
flexible
easy

Automatically index and structure your documents to easily and immediately find and consult them

Highlights

Search results in less than 1 second
Indexes mail accounts and cloud drives
Performs complex searches simply
Save searches and share results with other users
Allows questions in natural language

Multiple sources

The system lets the user transparently search data and documents that come from multiple sources and are stored in different formats.

Quick consultation

Direct access to the specific page of the document that contains the information searched for, reducing the time needed to open and explore documents, especially large ones.

Unique personal container

Access to a single container in which all useful information can be searched and consulted, both information used in the workplace and, where applicable, information available privately.

Quick search

Search results in less than 1 second

Annotation

Simple text annotations
Structured, manual and automatic annotations

Personal

Personal search space for content from workplace and private sources

Features

DATA AND DOCUMENTS ACQUISITION

The system acquires information from a variety of sources specified by the user (or by the system administrator), provided that the following conditions are met:

The rightful owner has granted the necessary access authorization
Access is supported either by the source, through an API, or by Columbus, through a custom connector written and authorized for that purpose

TRANSFORMATION

The acquired items come in different formats.

This process converts all acquired items, originally captured in their own formats, into a uniform set of searchable PDF documents. Depending on the format of the acquired item, the transformation may include tasks such as rendering, scanning, OCR, conversion, and export.
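
The idea that each format needs its own subset of tasks can be sketched as a simple lookup. This is only an illustration: the format names, the task lists assigned to them, and the function name plan_transformation are assumptions made for this example and do not describe the product's actual configuration.

```python
from typing import Dict, List

# Hypothetical mapping from source format to the ordered tasks needed
# to reach a searchable PDF. Format names and task lists are
# illustrative assumptions only.
TASKS_BY_FORMAT: Dict[str, List[str]] = {
    "searchable_pdf": [],                    # already in the target form
    "scanned_image": ["ocr", "conversion"],
    "email": ["rendering", "conversion"],
    "office_document": ["export"],
}


def plan_transformation(source_format: str) -> List[str]:
    """Return the ordered tasks for an item, or raise if the format is unknown."""
    try:
        return list(TASKS_BY_FORMAT[source_format])
    except KeyError:
        raise ValueError(f"no transformation defined for {source_format!r}") from None
```

An item that is already a searchable PDF gets an empty task list, which matches the statement that a task may or may not be needed for a given format.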

OPTIMIZED STORAGE

The system stores all processed items in an optimized way, using dedicated storage to maximize access performance during consultation.

While respecting access privacy and security, the system resolves duplicate content, which often arises in communications between members of the same organization (consider an email attachment that appears in dozens of items yet always has the same content).

The system then stores a single copy of each piece of content, identified by a hash function, which saves both storage space and the time that would otherwise be spent reprocessing content that has already been transformed.
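
Hash-based deduplication of this kind can be sketched in a few lines. The example below is a minimal in-memory illustration, not the product's storage layer: the class name ContentStore is hypothetical, and SHA-256 is an assumed choice, since the text above does not say which hash function is used.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Keeps one copy of each distinct content, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        """Store content if unseen and return its digest as the content ID."""
        digest = hashlib.sha256(content).hexdigest()
        # setdefault leaves an existing entry untouched, so a duplicate
        # takes no extra space and needs no further processing.
        self._blobs.setdefault(digest, content)
        return digest

    def get(self, digest: str) -> bytes:
        """Return the stored content for a digest."""
        return self._blobs[digest]

    def __len__(self) -> int:
        return len(self._blobs)
```

Storing the same attachment twice returns the same ID and leaves a single stored copy, so every message that carries the attachment can simply reference that ID.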

All searchable PDF documents are also stored in paged mode. In other words, the system stores a conversion that lets you consult a single page without accessing the entire document. For example, when the item searched for is mentioned on only a few pages of a document of one hundred pages or more, paged storage makes it possible to locate and access those pages immediately without downloading the entire document.
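
Paged storage can be pictured as keeping each page under its own key. The sketch below assumes pages are already available as separate byte strings; the class name PagedStore and its methods are hypothetical and serve only to illustrate the access pattern.

```python
from typing import Dict, List, Tuple


class PagedStore:
    """Stores each page separately so one page can be read on its own."""

    def __init__(self) -> None:
        self._pages: Dict[Tuple[str, int], bytes] = {}

    def put_document(self, doc_id: str, pages: List[bytes]) -> None:
        """Store every page of a document under (doc_id, page number)."""
        # Page numbers start at 1, as a reader would expect.
        for number, data in enumerate(pages, start=1):
            self._pages[(doc_id, number)] = data

    def get_page(self, doc_id: str, number: int) -> bytes:
        """Return a single page without reading the rest of the document."""
        return self._pages[(doc_id, number)]
```

A client that knows a match is on page 2 requests only that key, so the cost of consultation does not grow with the length of the document.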

CLASSIFICATION AND INDEXING

The classification process applies a set of automatic algorithms to all acquired documents. These algorithms extract and annotate structured information derived from both the document content and the source from which the document was acquired.

The automatic information-extraction algorithms can be combined according to customer needs: it is possible to select which algorithms are installed, how much computing capacity they may use, and the order in which they run. It is also possible to write and install custom algorithms that apply specific logic to identify and extract structured data. Algorithms that use external artificial intelligence services can also be installed, further extending the system's automatic analysis capabilities.
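
A configurable chain of extraction algorithms can be sketched as an ordered list of functions. Everything named here (the Extractor type, the two sample extractors, and classify) is a hypothetical illustration; the regular expressions are deliberately simple and stand in for whatever logic a real algorithm would apply.

```python
import re
from typing import Callable, Dict, List

# An extractor receives the document text and the metadata gathered so far,
# and returns new metadata fields to merge in.
Extractor = Callable[[str, Dict[str, object]], Dict[str, object]]


def extract_years(text: str, meta: Dict[str, object]) -> Dict[str, object]:
    """Annotate four-digit years (1900-2099) mentioned in the text."""
    return {"years": sorted(set(re.findall(r"\b(?:19|20)\d{2}\b", text)))}


def extract_emails(text: str, meta: Dict[str, object]) -> Dict[str, object]:
    """Annotate email addresses using a deliberately simple pattern."""
    return {"emails": sorted(set(re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)))}


def classify(text: str, pipeline: List[Extractor]) -> Dict[str, object]:
    """Run the extractors in the configured order and merge their output."""
    meta: Dict[str, object] = {}
    for extractor in pipeline:
        meta.update(extractor(text, meta))
    return meta
```

Selecting which algorithms run, and in which order, then amounts to choosing the contents of the list, for example classify(text, [extract_years, extract_emails]). A custom algorithm is just another function with the same shape.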

This set of metadata is then used to enrich the indexing phase, enabling the system to search information assets by combining full-text search with structured-data search. Structured information is also fundamental to the system's support for exploratory search, because it provides different semantic paths for refining search results.

During the indexing phase, the system also records the information needed to secure access to the information assets. Partitioning and sharing mechanisms guarantee that each user can search only within the assets they have been granted access to.
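
One common way to enforce this kind of restriction is to record, at indexing time, who may see each document and to apply that as a filter on every search. The sketch below assumes such a model; the IndexedDoc type, its allowed field, and the secure_search function are hypothetical names, and the actual partitioning mechanism may differ.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    text: str
    # Principals (user or group names) allowed to see the document,
    # recorded at indexing time in this sketch.
    allowed: FrozenSet[str] = field(default_factory=frozenset)


def secure_search(index: List[IndexedDoc], term: str,
                  principals: FrozenSet[str]) -> List[str]:
    """Return IDs of documents that contain term and are visible to the caller."""
    term = term.lower()
    return [
        doc.doc_id
        for doc in index
        if doc.allowed & principals       # security filter first
        and term in doc.text.lower()      # then the full-text match
    ]
```

Because the filter is part of the query itself, a document the caller cannot access never appears in the results, whatever the search terms are.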

SEARCH AND CONSULTING

Once the classification and indexing processes are complete, the system offers several ways to search, which differ both in function and in architecture.

– Search through the Columbus client: the dedicated client, available as a desktop application (Windows PC) and optionally as a mobile application (smartphone or tablet on iOS, Android, or Windows), provides all available search functions.

Searches combine full-text terms found in the indexed document text with filters on the metadata sets extracted by the system.

Results can then be refined through a faceting system based entirely on the extracted metadata. Facet selection supports both inclusions and exclusions, offering a powerful and intuitive way to locate the content of interest quickly.
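
Combining a full-text term with facet inclusions and exclusions can be sketched as follows. The document layout (a dict with id, text, and meta keys) and the function names matches and faceted_search are assumptions made for this example; a real index would not scan documents one by one.

```python
from typing import Dict, List, Optional, Set

FacetSelection = Optional[Dict[str, Set[str]]]


def matches(doc: dict, query: str,
            include: FacetSelection = None,
            exclude: FacetSelection = None) -> bool:
    """True if doc contains the query text and passes both facet selections."""
    if query and query.lower() not in doc["text"].lower():
        return False
    meta = doc["meta"]
    # Inclusion: the field must hold one of the selected values.
    for name, wanted in (include or {}).items():
        if meta.get(name) not in wanted:
            return False
    # Exclusion: the field must not hold any of the rejected values.
    for name, unwanted in (exclude or {}).items():
        if meta.get(name) in unwanted:
            return False
    return True


def faceted_search(docs: List[dict], query: str = "",
                   include: FacetSelection = None,
                   exclude: FacetSelection = None) -> List[str]:
    """Return the IDs of all documents that satisfy query and facets."""
    return [d["id"] for d in docs if matches(d, query, include, exclude)]
```

For instance, faceted_search(docs, "contract", include={"year": {"2023"}}) keeps only the 2023 documents that mention "contract", while passing exclude={"type": {"email"}} drops every email from the results.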

The system also supports so-called ‘exploratory search’, which lets you explore the entire set of information assets with no initial query, using only the faceting mechanism. The system architecture supports these features while maintaining the expected performance.
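
Exploratory search usually starts from a summary of which facet values exist and how many documents carry each one. The sketch below computes such a summary over in-memory documents; the document layout (a dict with a meta mapping) and the function name facet_counts are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List


def facet_counts(docs: List[dict]) -> Dict[str, Counter]:
    """Count how many documents carry each value of each metadata field."""
    counts: Dict[str, Counter] = {}
    for doc in docs:
        for name, value in doc["meta"].items():
            counts.setdefault(name, Counter())[value] += 1
    return counts
```

With no query at all, the user is shown these counts, picks a value, and the counts are recomputed over the narrowed set, which is what makes it possible to browse by metadata alone.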

– Search through the API: the search functions described above can also be integrated into other systems by using the available API with the appropriate security mechanisms, in order to query the system and obtain data in standard formats for the integrating system's own needs.

SHARING

Each user, if properly authorized through the administrator settings, can share their own search results with other users.

In particular, through the ‘smart sharing’ functions, the user can share information according to a functional logic that is not strictly tied to the typical security mechanisms of ACLs (users and groups). Sharing can be based on data classification, on full-text search results, or on both. This makes sharing more intuitive, and because it relies on functional rules, new items that match the rules are also shared over time. For example, if the user shares all the documents from a given year that deal with a specific topic, any item indexed in the future that matches the same criteria is shared automatically by the system.
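
Rule-based sharing of this kind can be sketched by storing the rule rather than a fixed list of documents. The names ShareRule and SharingIndex, and the choice to evaluate rules at read time, are assumptions made for this illustration and not a description of how the product implements smart sharing.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ShareRule:
    recipient: str
    predicate: Callable[[dict], bool]   # decides whether a document matches


class SharingIndex:
    """Holds documents and sharing rules; rules are evaluated on each read."""

    def __init__(self) -> None:
        self._rules: List[ShareRule] = []
        self._docs: List[dict] = []

    def add_rule(self, rule: ShareRule) -> None:
        self._rules.append(rule)

    def index(self, doc: dict) -> None:
        self._docs.append(doc)

    def shared_with(self, recipient: str) -> List[str]:
        """Return IDs shared with recipient, including later-indexed matches."""
        return [
            d["id"]
            for d in self._docs
            if any(r.recipient == recipient and r.predicate(d) for r in self._rules)
        ]
```

Because the rule is checked whenever results are requested, a document indexed after the rule was created is shared as soon as it matches, with no further action from the user who set up the rule.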
