Snip!t from collection of Alan Dix

Docvert - Microsoft Word to Open Standards [current 4.0]
http://holloway.co.nz/docvert/index.html

Categories

/Channels/techie/web development

/Channels/text mining

Docvert takes word processor files (typically .doc) and converts them to OpenDocument and clean HTML.

The Web Service receives a .doc file and converts it to an OpenDocument (ODF) file, which can then be converted to HTML, DocBook, RSS, or any XML format.

The resulting OpenDocument is then optionally converted to HTML or any XML. This is done with XML Pipelines, an approach that supports XSLT, breaking up content over headings or sections, and saving those results to multiple files (e.g., chapter1.html, chapter2.html…).
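The idea of breaking content up over headings can be illustrated with a small sketch. This is not Docvert's pipeline code: the function name `split_on_headings`, the choice of `h1` as the split point, and the sample markup are all assumptions made for illustration, and the sketch uses Python's standard XML parser rather than XSLT.

```python
import xml.etree.ElementTree as ET


def split_on_headings(xhtml, heading="h1"):
    """Split the children of <body> into one chunk per heading.

    Hypothetical helper, not part of Docvert. Returns a dict such as
    {"chapter1.html": "<div>...</div>", ...}. Any content that precedes
    the first heading becomes its own chunk.
    """
    body = ET.fromstring(xhtml).find("body")
    chunks = []
    for child in body:
        # Start a new chunk at each heading, or at the very first element.
        if child.tag == heading or not chunks:
            chunks.append([])
        chunks[-1].append(child)

    files = {}
    for number, elements in enumerate(chunks, start=1):
        wrapper = ET.Element("div")
        wrapper.extend(elements)
        files[f"chapter{number}.html"] = ET.tostring(wrapper, encoding="unicode")
    return files


# Assumed sample input: well-formed XHTML with two top-level headings.
SAMPLE = (
    "<html><body>"
    "<h1>Introduction</h1><p>First part.</p>"
    "<h1>Details</h1><p>Second part.</p><p>More.</p>"
    "</body></html>"
)

for name in split_on_headings(SAMPLE):
    print(name)
```

Each chunk could then be written to its own file, which mirrors the chapter1.html, chapter2.html naming mentioned above.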

The result is returned in a .zip file.
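Because the result arrives as a .zip file, a client will usually want to unpack it in memory. The following is a minimal sketch of that step using only the Python standard library. The archive contents are stand-ins built locally by the hypothetical helper `make_demo_zip`; the actual file names inside a Docvert result are not specified here.

```python
import io
import zipfile


def read_result_zip(zip_bytes):
    """Return {member name: raw bytes} for every file in an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def make_demo_zip():
    """Build a stand-in archive so the sketch runs without any service."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("chapter1.html", "<h1>Introduction</h1>")
        archive.writestr("chapter2.html", "<h1>Details</h1>")
    return buffer.getvalue()


pages = read_result_zip(make_demo_zip())
for name in sorted(pages):
    print(name, len(pages[name]), "bytes")
```

Returning raw bytes rather than decoded text avoids assuming that every member of the archive is a text file.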

Docvert has a user-friendly interface, and it's easy to integrate with other software because it uses a simple REST-style interface. It's released under the GPL v3, so although it's Free Software, there are no legal problems developing proprietary software on top of the Web Service interface. The XML produced is easier to understand and more structured than the OOXML or .DOC formats.
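To give a feel for what integrating with a REST-style upload interface might involve, here is a rough sketch that builds a multipart/form-data POST request with the Python standard library. The endpoint URL, the form field name `document`, and the function `build_upload_request` are all hypothetical; Docvert's actual request format is not described in this text and should be checked against its own documentation.

```python
import uuid
import urllib.request


def build_upload_request(url, field, filename, payload):
    """Build (but do not send) a multipart/form-data POST carrying one file.

    Every argument value used below is an illustrative assumption,
    not Docvert's documented API.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/msword\r\n\r\n",
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical endpoint and field name, with placeholder file bytes.
request = build_upload_request(
    "http://localhost:8080/convert", "document", "report.doc", b"DOC-BYTES"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the response body could then be treated as the returned .zip file.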