Easily convert Docx to HTML in your Java applications
Sferyx JSyndrome DocxHTML Converter Component Edition
Advanced Java Docx to HTML Converter component - easily convert Microsoft Word Docx files to HTML in Java
Sferyx DocxHTMLConverter Component is an advanced and powerful Java Docx to HTML converter and generator component. It converts Microsoft Word Docx files to HTML in any Java application - Java Swing, JavaFX, SWT Eclipse and also Oracle Forms - and produces paginated documents that preserve the formatting, including page breaks, headers, footers and page numbers. With only a few lines of Java code it is possible to generate complex HTML files from almost any Word Docx source or location, and the resulting HTML can be written to a local file, written to a java.io.OutputStream or shown automatically inside the browser. The component supports all UTF-8 languages, including Greek, Arabic, Cyrillic, Hebrew, Farsi, Chinese, Japanese, Hindi, Tamil and more. It is ready for use out of the box and does not depend on external packages.
You can create the files dynamically by adding the content on-the-fly directly inside your application and also generate page breaks when needed.
Trusted Code Signing Security Certificate from Thawte
Version 23.0
Sferyx JSyndrome DocxHTML Converter Component Edition : Download DocxHTMLConverterDemo.zip
- Pure Java Docx to HTML generation engine - allows fast and easy HTML creation from various sources and converts even very complex Docx documents with a single line of code - 100% in-house development - it does not depend on external packages.
- Converts and generates quickly and easily HTML files directly from Microsoft Word Docx documents
- Royalty free redistribution with your applications
- Inclusion of all images including the inline Base64 encoded images, inline and linked CSS styles etc.
- Works with any JRE/ JDK 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 9, 10, 11, 12, 13, 14, 15, 16 or higher
- Support for Oracle Forms and full generation of HTML from Docx from Oracle Forms and CLOB
- Fully compatible with Java Swing, JavaFX, SWT Eclipse, Oracle Forms, Java Servlets, JSP
- Compatible with Headless mode for server systems
- Compact size and fast document generation
- All hyperlinks inside the Docx document are now generated automatically as links (annotations) in the resulting HTML file
- Support for disabling the table breaking across multiple pages
- Support for disabling lists breaking across multiple pages
- Support for the CSS page break elements page-break-before:always, page-break-after:always, page-break-inside:never
Example usage
Using the DocxHTMLConverter component is quite simple - with only a few lines of code it is possible to generate and convert practically any Docx document to HTML.
Here are some examples on how to convert Microsoft Word Docx to HTML in Java with the Sferyx DocxHTMLConverter:
Convert Word Docx URL to HTML file
This method converts the Docx to HTML and saves it to the given file. The destinationFile parameter is a java.io.File object:
DocxHTMLConverter docxHTMLConverter=new DocxHTMLConverter ();
docxHTMLConverter.generateHTMLFromDocxURL ("http://your_url_here.docx", destinationFile);
or using the file name as String:
DocxHTMLConverter docxHTMLConverter=new DocxHTMLConverter ();
docxHTMLConverter.generateHTMLFromDocxURL ("http://your_url_here.docx", "c:/docxgenerator-test1.html");
Convert Word Docx URL to HTML OutputStream
This method converts the specified Docx document to HTML using a standard page format string such as "A4" or "Letter" and saves it to the specified OutputStream. It recognizes automatically whether the document is a Docx file and converts it accordingly. To use this automatic conversion, the URL must end with the corresponding extension, such as docx.
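No code is shown on this page for the OutputStream variant. The following is a minimal sketch only, assuming the `generateHTMLFromDocxURL(java.net.URL, java.io.OutputStream)` overload listed in the method summary. The `DocxToStream` interface and the `convertToString` helper are hypothetical names introduced here so the sketch compiles without the Sferyx jar, and treating the output as UTF-8 is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical adapter (not part of the Sferyx API). It stands in for a call such as
//   new DocxHTMLConverter().generateHTMLFromDocxURL(source, out)
interface DocxToStream {
    void convert(URL source, OutputStream out) throws Exception;
}

final class StreamConversionSketch {
    // Runs the conversion into an in-memory stream and returns the HTML as text.
    // Assumption: the converter writes UTF-8; change the charset if it does not.
    static String convertToString(DocxToStream converter, URL source) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            converter.convert(source, buffer);
        } catch (Exception e) {
            // Wrap any failure so callers do not have to handle checked exceptions.
            throw new IllegalStateException("Docx to HTML conversion failed", e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With the component on the classpath, the adapter could presumably be supplied as `(src, out) -> new DocxHTMLConverter().generateHTMLFromDocxURL(src, out)`. Any other OutputStream, such as a FileOutputStream, can take the place of the in-memory buffer.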
Convert Word Docx URL to HTML with different Page Format dialog options
Converts the Docx URL to HTML automatically and generates the file using the File dialog options. It displays a File dialog for saving the generated file. This method recognizes automatically whether the document is Docx and converts it accordingly. To use this automatic conversion, the URL must end with the corresponding extension, such as docx.
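Because the automatic detection depends on the URL ending in the docx extension, it can be useful to check the URL before starting a conversion. The following is a small sketch of such a check; the `DocxUrlCheck` class and `looksLikeDocx` method are hypothetical names, and ignoring letter case and surrounding whitespace is an assumption, not documented behaviour of the component.

```java
import java.util.Locale;

final class DocxUrlCheck {
    // Returns true when the URL text ends with ".docx".
    // Assumption: case and surrounding whitespace are ignored; the component
    // itself may apply a stricter rule.
    static boolean looksLikeDocx(String url) {
        if (url == null) {
            return false;
        }
        return url.trim().toLowerCase(Locale.ROOT).endsWith(".docx");
    }
}
```

A URL that carries a query string after the file name does not end with the extension, so this check reports false for it.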
Dynamically Generate HTML from Word Docx and convert multiple files in Java with the Sferyx DocxHTML Converter
You can generate even very complex HTML documents dynamically in your Java application by simply providing all the formatting in HTML and inserting page breaks when new pages are needed - the HTML Generator automatically takes care of the pagination of long formatted text spanning multiple pages, as well as tables, lists etc.
This functionality is well suited to creating reports and other documents that need to be generated dynamically with rich text formatting.
You can also insert Docx files dynamically, which will be converted automatically to HTML, or add other HTML content along with the Docx files. Images are embedded as Base64 encoded strings inside the HTML document. Everything is converted automatically and inserted as HTML in the same document.
DocxHTMLConverter docxHTMLConverter=new DocxHTMLConverter();
//Open the content buffer to insert the content - HTML, Docx etc - everything can be merged together.
docxHTMLConverter.openContentBuffer();
//Append the content to the content buffer - you can insert styles, images and any kind of formatting.
docxHTMLConverter.appendHTMLContentToContentBuffer("<style>body{font-size:12pt;color:blue;} h1{background-color:yellow;}</style>");
docxHTMLConverter.appendHTMLContentToContentBuffer("<h1>This is H1 header</h1>Some other text <b>very important <i>stuff</i></b> with page break after");
//Insert page break to create new page - the HTMLGenerator will handle automatically all the pagination for long text if more pages are needed, tables and everything.
docxHTMLConverter.addPageBreakToContentBuffer();
//Append the content for the new page.
docxHTMLConverter.appendHTMLContentToContentBuffer("<h2 style=\"background-color:green;border-bottom:1px solid red;color:white\">This is second H2 header</h2>Some other text <span style=\"color:orange\">extremely interesting <u>stuff</u></span><br>");
//Insert another page break...
docxHTMLConverter.addPageBreakToContentBuffer();
....
//Append MS Word Docx file directly to the content buffer and it will be converted to HTML in the same document
docxHTMLConverter.appendDocxToContentBuffer(new java.net.URL("file:///c:/test/demo.docx"));
...
//Append another MS Word Docx file directly to the content buffer and it will be converted to HTML in the same document
docxHTMLConverter.appendDocxToContentBuffer(new java.net.URL("file:///c:/test/Sample06-1.docx"));
.....
//Close the content buffer and create the HTML document - there is a possibility to write it to File, OutputStream etc.
docxHTMLConverter.closeBufferAndGenerateHTML("c:/test/dynamic.html");
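For reports, the HTML passed to `appendHTMLContentToContentBuffer` is usually built from application data. The following is a minimal sketch of one way to do that, using only the JDK; the `ReportHtmlSketch` class and its methods are hypothetical helpers, not part of the component, and the table markup is just one possible layout.

```java
import java.util.List;

final class ReportHtmlSketch {
    // Escapes the characters that have a special meaning in HTML text.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds one HTML table from rows of cell values; the first row is the header.
    static String table(List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<table border=\"1\">");
        for (int r = 0; r < rows.size(); r++) {
            String cell = (r == 0) ? "th" : "td";
            html.append("<tr>");
            for (String value : rows.get(r)) {
                html.append('<').append(cell).append('>')
                    .append(escape(value))
                    .append("</").append(cell).append('>');
            }
            html.append("</tr>");
        }
        return html.append("</table>").toString();
    }
}
```

The resulting string could then be appended between `openContentBuffer()` and `closeBufferAndGenerateHTML(...)`, for example with `docxHTMLConverter.appendHTMLContentToContentBuffer(ReportHtmlSketch.table(rows))`. Escaping the cell values keeps data containing characters such as `<` or `&` from being read as markup.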
Command line arguments for the DocxHTMLConverter.jar file
You can easily execute the DocxHTMLConverter.jar from the command line and perform document conversions without writing code using the available command line arguments as follows:
java -jar DocxHTMLConverter.jar absolute_url destination_file
Example:
C:\test>java -jar "C:\test\DocxHTMLConverter.jar" http://your_url_here c:/test/test-html.html
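The same command line can also be started from another Java program. The following is a sketch using the standard `ProcessBuilder` class and the argument order shown above (`absolute_url destination_file`); the `ConverterCommandSketch` class is a hypothetical helper, and it assumes that `java` is on the PATH. What exit code the jar returns on failure is not documented here, so the value is passed through without interpretation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ConverterCommandSketch {
    // Builds the argument list for:
    //   java -jar <jarPath> <absolute_url> <destination_file>
    static List<String> command(String jarPath, String sourceUrl, String destinationFile) {
        List<String> args = new ArrayList<>();
        args.add("java");   // assumes the java launcher is on the PATH
        args.add("-jar");
        args.add(jarPath);
        args.add(sourceUrl);
        args.add(destinationFile);
        return args;
    }

    // Starts the conversion as a child process and returns its exit code.
    static int run(String jarPath, String sourceUrl, String destinationFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(jarPath, sourceUrl, destinationFile))
                .inheritIO() // show the converter's console output in this process
                .start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.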
Methods available in the sferyx.administration.htmlgenerator.DocxHTMLConverter class
Method Summary | |
---|---|
void | addPageBreakToContentBuffer() Adds an HTML page break to the content buffer; all content appended after it will be on the next page when printed. |
void | appendDocxToContentBuffer(java.io.File file) Appends the whole content of the Docx file from the given File to the content buffer. |
void | appendDocxToContentBuffer(java.net.URL file) Appends the whole content of the Docx file from the given URL to the content buffer. |
void | appendHTMLContentToContentBuffer(java.lang.String content) Appends a new HTML string to the existing content buffer. |
void | clearContentBuffer() Closes the content buffer and clears the content. |
String | closeBufferAndGenerateHTML() Generates the HTML content automatically for the content buffer created previously with the openContentBuffer() and appendXXXToContentBuffer() methods. |
void | closeBufferAndGenerateHTML(java.io.OutputStream destinationStream) Closes the existing content buffer and generates the resulting content from the DocxHTML Converter, which is saved to the given OutputStream. |
void | closeBufferAndGenerateHTML(java.lang.String destinationFile) Generates the HTML content automatically for the content buffer created previously with the openContentBuffer() and appendXXXToContentBuffer() methods. |
String | generateHTMLFromContent(java.lang.String content) Generates HTML automatically for the given image or HTML content. |
void | generateHTMLFromContent(java.lang.String content, java.io.File destinationFile) Generates HTML automatically for the given HTML content. |
void | generateHTMLFromContent(java.lang.String content, java.io.OutputStream destinationStream) Generates HTML automatically for the given image or HTML content. |
void | generateHTMLFromContent(java.lang.String content, java.lang.String destinationFile) Generates HTML automatically for the given HTML content. |
String | generateHTMLFromDocxURL(java.lang.String sourceURL) Generates HTML automatically for the given URL source containing an MS Word Docx file. |
void | generateHTMLFromDocxURL(java.lang.String sourceURL, java.io.File destinationFile) Generates HTML automatically for the given URL source containing an MS Word Docx file. |
void | generateHTMLFromDocxURL(java.lang.String sourceURL, java.lang.String destinationFile) Generates HTML automatically for the given URL source containing an MS Word Docx file. |
String | generateHTMLFromDocxURL(java.net.URL sourceURL) Generates HTML automatically for the given URL source containing an MS Word Docx file. |
void | generateHTMLFromDocxURL(java.net.URL sourceURL, java.io.File destinationFile) Generates HTML automatically for the given URL source containing an MS Word Docx file. |
void | generateHTMLFromDocxURL(java.net.URL sourceURL, java.io.OutputStream fos) Generates HTML automatically for the given URL source containing an MS Word Docx file. |
void | generateHTMLFromURL(java.lang.String sourceURL) Generates HTML automatically for the given URL source. |
void | generateHTMLFromURL(java.lang.String sourceURL, java.io.File destinationFile) Generates HTML automatically for the given URL source and saves the result to destinationFile as a string. |
void | generateHTMLFromURL(java.lang.String sourceURL, java.io.OutputStream destinationStream) Generates HTML automatically for the given URL source and saves the result to the given OutputStream as a string. |
void | generateHTMLFromURL(java.lang.String sourceURL, java.lang.String destinationFile) Generates HTML automatically for the given URL source and saves the result to destinationFile as a string. |
String | generateHTMLFromURL(java.net.URL sourceURL) Generates HTML automatically for the given URL source; the result is returned as a String. |
void | generateHTMLFromURL(java.net.URL sourceURL, java.io.OutputStream destinationStream) Generates HTML automatically for the given URL source and saves the result to destinationStream as a string. |
void | openContentBuffer() Opens a new content buffer for inserting content to be used for dynamic HTML generation. |
Customers
The Sferyx customer base counts more than 1000 corporate customers and institutions from over 40 countries and a range of industrial sectors, including media and publishing companies, Internet service providers, research labs, Fortune 500 companies, universities, colleges and schools, software developers, content management system developers and web design agencies.
Copyright © 2001-2024 Sferyx Srl. All rights reserved. Sferyx and the Sferyx logo are registered trademarks of Sferyx Srl. http://www.sferyx.com