If you're working with ASP.NET C# and you need to open, edit or otherwise access a Microsoft Word DOC or DOCX file, you can easily do that using the Microsoft.Office.Interop.Word library package. This post explains how: you might find it useful if you need to perform such a task, or if you simply want some insight into the process.
Introducing Interop.Word
To access the namespace from your ASP.NET project you have two main choices:
- Install the official Microsoft Office primary interop assemblies (PIAs) package on your machine by downloading and executing the runtime installer, then manually add a Reference to the Microsoft.Office.Interop.Word.dll file.
- Install the appropriate NuGet package within Visual Studio using the Package Manager Console.
Needless to say, you should really go for the second option, but we'll leave that to you.
Working with the Document
As soon as you have the namespace available, you can do the following:
    // Namespace alias to avoid writing the full namespace every time
    using word = Microsoft.Office.Interop.Word;

    // [...]

    word.Application app = new word.Application();
    word.Document doc = app.Documents.Open(filePath);
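Since the code runs on a server, it can help to keep the Word instance hidden and to suppress its dialogs before opening the file. The following is a minimal sketch of that idea, not a definitive implementation: the class and method names (WordOpenExample, OpenQuietly) are hypothetical, and the choice of options is an assumption about what a typical server-side scenario needs.

```csharp
using word = Microsoft.Office.Interop.Word;

public static class WordOpenExample
{
    // Opens filePath in the given Word instance after hiding the window
    // and turning off alert dialogs. The caller still owns both objects
    // and must close the document and quit the application.
    public static word.Document OpenQuietly(word.Application app, string filePath)
    {
        app.Visible = false;
        app.DisplayAlerts = word.WdAlertLevel.wdAlertsNone;

        return app.Documents.Open(
            FileName: filePath,
            ReadOnly: false,
            AddToRecentFiles: false);
    }
}
```

It would be called as `var doc = WordOpenExample.OpenQuietly(app, filePath);` right after creating the app object.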
Once you have the app and the doc objects you can perform a lot of editing tasks, such as:
Find and Replace Text
    var textToFind = "any source text";
    var textToReplace = "any replacement text";
    var matchCase = true;
    var matchWholeWord = true;
    var matchWildcards = false;
    var matchSoundsLike = false;
    var matchAllWordForms = false;
    var forward = true;
    var wrap = 1;       // WdFindWrap.wdFindContinue
    var format = false;
    var replace = 2;    // WdReplace.wdReplaceAll

    app.Selection.Find.Execute(
        textToFind,
        matchCase,
        matchWholeWord,
        matchWildcards,
        matchSoundsLike,
        matchAllWordForms,
        forward,
        wrap,
        format,
        textToReplace,
        replace);
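The same replacement can also be written against the document's content range instead of the current selection, using named arguments and the enum values rather than raw integers. This is only a sketch under a few assumptions: the helper name ReplaceAll is hypothetical, and doc.Content covers the main body of the document only, so headers, footers and text boxes would need separate handling.

```csharp
using word = Microsoft.Office.Interop.Word;

public static class WordReplaceExample
{
    // Replaces every occurrence of findText in the main body of the document.
    // Returns the result of Find.Execute (true when the search found a match).
    public static bool ReplaceAll(word.Document doc, string findText, string replaceText)
    {
        word.Find find = doc.Content.Find;

        // Drop any formatting criteria left over from earlier searches.
        find.ClearFormatting();
        find.Replacement.ClearFormatting();

        return find.Execute(
            FindText: findText,
            MatchCase: true,
            MatchWholeWord: true,
            Forward: true,
            Wrap: word.WdFindWrap.wdFindStop,
            ReplaceWith: replaceText,
            Replace: word.WdReplace.wdReplaceAll);
    }
}
```

Working on a range avoids moving the selection, which is one less piece of shared state to worry about when the Word window is hidden.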
Find and Replace Bookmarks
    var bookmarkName = "anyName";
    var bookmarkNewValue = "anyValue";
    if (doc.Bookmarks.Exists(bookmarkName))
    {
        doc.Bookmarks[bookmarkName].Select();
        app.Selection.TypeText(bookmarkNewValue);
    }
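Overwriting the text of a bookmark normally removes the bookmark itself, which matters if the same document has to be filled in more than once. A possible variant, sketched below with a hypothetical helper name (SetBookmarkText), writes the new text through the bookmark's range and then re-adds the bookmark over it.

```csharp
using word = Microsoft.Office.Interop.Word;

public static class WordBookmarkExample
{
    // Replaces the text of a bookmark and re-creates the bookmark
    // around the new text. Returns false if the bookmark does not exist.
    public static bool SetBookmarkText(word.Document doc, string bookmarkName, string newText)
    {
        if (!doc.Bookmarks.Exists(bookmarkName))
            return false;

        word.Range range = doc.Bookmarks[bookmarkName].Range;

        // Assigning Text replaces the bookmarked content (and drops the bookmark);
        // the range now spans the newly written text.
        range.Text = newText;

        // Put the bookmark back so it can be found again later.
        doc.Bookmarks.Add(bookmarkName, range);
        return true;
    }
}
```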
Convert a DOC / DOCX file to PDF
Surprisingly enough, we can even do that with a one-liner thanks to Word's native "Save As PDF..." feature; the SaveAs2 method used here is available from Office 2010 onwards.
    doc.SaveAs2("path-to-pdf-file.pdf", word.WdSaveFormat.wdFormatPDF);
Export a DOC / DOCX file into a PDF
This one is almost identical to the previous one in terms of results.
    doc.ExportAsFixedFormat("path-to-pdf-file.pdf", word.WdExportFormat.wdExportFormatPDF);
... and so on.
For additional info regarding Word-to-PDF conversion, you can also read this dedicated post; otherwise, keep reading.
From a Byte Array
What if you have the DOC or DOCX file stored outside the file system, for example as a blob in a database? In that case you need to use a temporary file, because most Office Interop methods do not support working with byte arrays, streams and so on.
Here's a decent workaround you can use:
    // byte[] fileBytes = getFileBytesFromDB();
    var tmpFile = Path.GetTempFileName();
    File.WriteAllBytes(tmpFile, fileBytes);

    word.Application app = new word.Application();
    // Open the temporary file we just wrote
    word.Document doc = app.Documents.Open(tmpFile);

    // .. do your stuff here ...

    doc.Close();
    app.Quit();
    byte[] newFileBytes = File.ReadAllBytes(tmpFile);
    File.Delete(tmpFile);
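The temporary-file round trip can be factored into a small helper so that the file gets deleted even when the editing step throws. The sketch below is an illustration only: the names TempFileRoundTrip and Process are hypothetical, and the Word-specific work is passed in as a delegate that receives the temporary file path.

```csharp
using System;
using System.IO;

public static class TempFileRoundTrip
{
    // Writes the input bytes to a temporary file, lets the caller work on
    // that file by path, then returns the file's final content.
    // The temporary file is deleted whether or not the work succeeds.
    public static byte[] Process(byte[] input, Action<string> work)
    {
        string tmpFile = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(tmpFile, input);
            work(tmpFile);
            return File.ReadAllBytes(tmpFile);
        }
        finally
        {
            File.Delete(tmpFile);
        }
    }
}
```

A caller would pass a lambda that opens the given path with Word, edits the document, closes it and quits the application, for example `TempFileRoundTrip.Process(fileBytes, path => { /* Interop work on path */ });`.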
You might notice that we used the Close() method in order to close (and thus save) the file. If you don't want to save your changes to the DOC / DOCX file you opened, you need to say so explicitly by passing the WdSaveOptions.wdDoNotSaveChanges value as a parameter in the following way:
    doc.Close(word.WdSaveOptions.wdDoNotSaveChanges);
IMPORTANT: Do not underestimate the call to app.Quit()! If you don't do that, the MS Word instance will be left open on your server (see this thread on StackOverflow for more info on that issue). If you want to be sure to avoid such a dreadful scenario entirely, you should strengthen the given implementation by adding a try/catch fallback strategy such as the following:
    word.Application app = null;
    word.Document doc = null;
    try
    {
        app = new word.Application();
        doc = app.Documents.Open(filePath);
        // .. do your stuff here ...
        doc.Close();
        app.Quit();
    }
    catch (Exception e)
    {
        // Fallback: make sure the document and the Word instance get closed
        if (doc != null) doc.Close();
        if (app != null) app.Quit();
    }
Unfortunately these objects don't implement IDisposable, otherwise it would've been even easier.
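One way to work around the missing IDisposable is to wrap both objects in a small class of your own, so that a using block takes care of the cleanup. The following is a rough sketch under stated assumptions, not a definitive implementation: WordSession is a hypothetical name, and this version closes the document without saving, so any changes must be saved explicitly (for example with Doc.Save()) before the session is disposed.

```csharp
using System;
using System.Runtime.InteropServices;
using word = Microsoft.Office.Interop.Word;

public sealed class WordSession : IDisposable
{
    public word.Application App { get; private set; }
    public word.Document Doc { get; private set; }

    public WordSession(string filePath)
    {
        App = new word.Application();
        try
        {
            Doc = App.Documents.Open(filePath);
        }
        catch
        {
            // Opening failed: do not leave the Word instance running.
            Dispose();
            throw;
        }
    }

    public void Dispose()
    {
        if (Doc != null)
        {
            try { Doc.Close(word.WdSaveOptions.wdDoNotSaveChanges); }
            catch (COMException) { /* the document may already be closed */ }
            Marshal.ReleaseComObject(Doc);
            Doc = null;
        }

        if (App != null)
        {
            try { App.Quit(); }
            finally
            {
                Marshal.ReleaseComObject(App);
                App = null;
            }
        }
    }
}
```

Usage would then look like `using (var session = new WordSession(filePath)) { /* edit session.Doc, then session.Doc.Save(); */ }`.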
That's pretty much it: happy coding!