Converting Word to HTML without requiring Word components

  • 2020-06-03 06:14:36
  • OfStack

Basic idea:
Upload the Word file to the server, convert its content and save it as HTML, then load the HTML content.

1: Use the Microsoft.Office.Interop.Word component
This is one of the more common approaches. There are plenty of examples on the web, so I won't post one here.
Disadvantages: Word has to be installed on the server, and the permissions of the DCOM objects have to be configured there. That is fine for a single server, but it becomes tedious when the project is deployed to several different servers.
2: OpenXml API
You can convert .docx files (the Word 97-2003 .doc format is not supported) to XML. Once you have XML, converting to HTML or any other format is no longer a problem. This API requires .NET Framework 3.5+ and the Office 2007+ file format.
3: Third-party: Aspose.Words (tested, recommended)
Aspose provides a variety of format conversion products for both .NET and Java; take a closer look at their site if you are interested. The Aspose.Words DLL can convert Word documents (DOC and DOCX) without Microsoft Office components installed.

// wordPhysicalPath: physical path of the uploaded Word file on the server
Aspose.Words.Document d = new Aspose.Words.Document(wordPhysicalPath);
d.Save("d:\\1.html", Aspose.Words.SaveFormat.Html);

This saves the document as HTML. Note that the images in the Word file are saved to the same directory as the HTML file, so when reading the HTML content you need to replace <img src=' with <img src=' + the image virtual path.
Advantages: You don't need to install Microsoft Office components. A single DLL of roughly 2 MB is all it takes.
Disadvantages: Aspose is not open source. Cracked versions circulate in China, and you could decompile and modify it yourself, but the copyright issue is a real factor to consider.
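
The image path replacement mentioned for the Aspose approach can be sketched as follows. This is only an illustration, not part of Aspose.Words: the class name HtmlImagePathRewriter, the method Rewrite, and the virtual directory "/upload/word" are hypothetical names chosen for the example. It assumes the saved HTML references its images by relative file name, and it uses a regular expression instead of a full HTML parser, so unusual markup may not be handled.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: prefixes relative <img src> values with a virtual directory.
public static class HtmlImagePathRewriter
{
    // Group 1: everything from "<img" up to "src=", group 2: the quote character,
    // group 3: the path between the quotes.
    private static readonly Regex ImgSrc = new Regex(
        @"(<img\b[^>]*?\bsrc\s*=\s*)(['""])(.*?)\2",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Rewrite(string html, string virtualDir)
    {
        // Normalize so the prefix always ends with exactly one slash.
        string prefix = virtualDir.TrimEnd('/') + "/";

        return ImgSrc.Replace(html, m =>
        {
            string src = m.Groups[3].Value;

            // Leave absolute URLs, root-relative paths and data URIs untouched.
            if (src.StartsWith("/", StringComparison.Ordinal)
                || src.Contains("://")
                || src.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
            {
                return m.Value;
            }

            string quote = m.Groups[2].Value;
            return m.Groups[1].Value + quote + prefix + src + quote;
        });
    }
}
```

A typical call would read the saved file with File.ReadAllText("d:\\1.html") and pass the result to HtmlImagePathRewriter.Rewrite(html, "/upload/word") before sending it to the page.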

There are other third-party components as well, most of them paid; they are not listed here.
