Khemeia™ automates the entire transformation process and goes further. Using artificial intelligence techniques, it systematically extracts and semantically tags metadata, structures and hierarchically organizes information, generates tables of contents, and converts the result to XML-based outputs, all in real time.

Khemeia™ creates structured content from paper and from PDF, Word, ASCII, OCR (Optical Character Recognition) output, RTF, Excel, CSV, SGML, QuarkXPress, Adobe InDesign and HTML.

Khemeia’s four-step transformation process:

Detection of content elements in a class of documents, as defined in the customer DTD (Document Type Definition) or XML Schema, for example: section titles, numbers, headers, paragraphs, hyperlinks, tables, graphics.

The extracted content elements are semantically tagged, for example: section titles (court name), headers (case name), numbers (page numbers), paragraphs (alinea), tables (evidence list).
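
The detection and tagging steps can be illustrated with a small Python sketch. It is a rule-based stand-in and not Khemeia's AI-based method: the regular expressions, the tag names (`court_name`, `case_name`, `page_number`, `alinea`) and the sample text are all assumptions chosen to mirror the legal example above.

```python
import re

# Hypothetical rules: (detected element, semantic tag, pattern).
# These are illustrative assumptions, not Khemeia's actual detection logic.
RULES = [
    ("section_title", "court_name", re.compile(r"^[A-Z][A-Z ]*COURT[A-Z ]*$")),
    ("header", "case_name", re.compile(r"^.+ v\. .+$")),
    ("number", "page_number", re.compile(r"^(?:Page )?\d+$")),
]


def tag_line(line):
    """Detect the element type of one line and attach a semantic tag."""
    text = line.strip()
    for element, tag, pattern in RULES:
        if pattern.match(text):
            return {"element": element, "tag": tag, "text": text}
    # Anything that matches no rule is treated as an ordinary paragraph.
    return {"element": "paragraph", "tag": "alinea", "text": text}


def tag_document(raw):
    """Tag every non-empty line of a plain-text document."""
    return [tag_line(line) for line in raw.splitlines() if line.strip()]


sample = """SUPREME COURT OF EXAMPLE
Smith v. Jones
The appellant argues that the contract was void.
Page 3"""

for item in tag_document(sample):
    print(item["tag"], "->", item["text"])
```

In a real deployment the element classes and tag names would come from the customer DTD or XML Schema rather than from a hard-coded rule list.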

Structuring the content involves splitting the document into relevant modules, creating the hierarchy, formatting bullet lists, generating the table of contents and creating linkages.
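
As a minimal sketch of the hierarchy and table-of-contents part of this step, the Python below nests a flat list of headings into a tree and numbers the entries. The input format (`(level, title)` pairs with levels starting at 1), the function names and the sample headings are assumptions for illustration only; the source does not describe how Khemeia represents a hierarchy internally.

```python
def build_hierarchy(headings):
    """Nest (level, title) pairs into a tree using a stack of open ancestors.

    Assumes heading levels are integers starting at 1.
    """
    root = {"title": "ROOT", "level": 0, "children": []}
    stack = [root]
    for level, title in headings:
        node = {"title": title, "level": level, "children": []}
        # Close every open section that is at the same depth or deeper.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


def table_of_contents(node, prefix=""):
    """Return numbered table-of-contents lines for all descendants of node."""
    lines = []
    for i, child in enumerate(node["children"], start=1):
        number = f"{prefix}{i}"
        lines.append(f"{number} {child['title']}")
        lines.extend(table_of_contents(child, prefix=number + "."))
    return lines


# Hypothetical headings, as they might come out of the detection step.
headings = [(1, "Facts"), (2, "Background"), (2, "Procedure"), (1, "Judgment")]
toc = table_of_contents(build_hierarchy(headings))
print("\n".join(toc))
```

The same tree could also drive module splitting and linkage creation, since every node knows its parent section and its children.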

Output types: XML, PDF, HTML, DITA, JPEG, XMP, NITF, NewsML, S1000D and customer-specific formats.
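
For the XML output type, a minimal sketch using Python's standard `xml.etree.ElementTree` module shows how semantically tagged items might be serialized. The root element name `document`, the tag names and the sample items are hypothetical; a real output would follow the customer DTD or XML Schema.

```python
import xml.etree.ElementTree as ET


def to_xml(items, root_name="document"):
    """Serialize tagged items as child elements of a single root element."""
    root = ET.Element(root_name)
    for item in items:
        element = ET.SubElement(root, item["tag"])
        # ElementTree escapes characters such as < and & when serializing.
        element.text = item["text"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical tagged items, for illustration only.
items = [
    {"tag": "case_name", "text": "Smith v. Jones"},
    {"tag": "alinea", "text": "Costs < 100 & fees"},
]
xml_text = to_xml(items)
print(xml_text)
```

Building the output through an XML library rather than string concatenation keeps the result well-formed even when the source text contains markup characters.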