Tom Jebo
Sr Escalation Engineer
Transform Open XML Documents with Open XML SDK, Azure Functions and Microsoft Flow
India 2014
Frustration!
What if…we could automate?
Goals
• Allow presenters to upload their slide deck
• Automatically process the deck to change the theme
What do we need?
• A repository for the decks
• An understanding of the Open XML file format
• A way to programmatically transform a deck
• A way to automate the process
Agenda
Overview of Open XML File Format Markup Languages
Open XML SDK
Demo: Transforming a Presentation
Tools, Support & Resources
Brief history of Office Open XML
2000: First XML-based format used by Office XP
2003: Microsoft Office XML format released in Office 2003
2005/6: Office Open XML submitted to Ecma International
2007: Office 2007 makes OOXML the default file format
2008: ISO/IEC 29500:2008 published
Support by Office Version
First to support OOXML, default format
ECMA-376 support
ISO/IEC 29500 “Transitional” r/w
ISO/IEC 29500 “Strict” read
ECMA-376 read
ISO/IEC 29500 “Strict” r/w, i.e. File | Save As… and choose “Strict Open XML Presentation (*.pptx)”
“Strict” r/w
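To make the Strict/Transitional distinction concrete, here is a minimal sketch that classifies a deck by the namespace of its main part's root element. It uses Python's standard library rather than the Open XML SDK, and it assumes the main part lives at the conventional path ppt/presentation.xml (a robust reader would resolve it through the package relationships instead). The helper names and the stub package are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Root-element namespaces of the main PresentationML part.
TRANSITIONAL_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
STRICT_NS = "http://purl.oclc.org/ooxml/presentationml/main"


def conformance_class(pptx_bytes: bytes) -> str:
    """Classify a .pptx as 'strict', 'transitional' or 'unknown'."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as package:
        # Assumption: the main part sits at the conventional location.
        root = ET.fromstring(package.read("ppt/presentation.xml"))
    # ElementTree renders qualified tags as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    if namespace == STRICT_NS:
        return "strict"
    if namespace == TRANSITIONAL_NS:
        return "transitional"
    return "unknown"


def make_stub(namespace: str) -> bytes:
    """Build a stub package holding only ppt/presentation.xml (demo data, not a valid deck)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("ppt/presentation.xml",
                         f'<p:presentation xmlns:p="{namespace}"/>')
    return buffer.getvalue()


print(conformance_class(make_stub(STRICT_NS)))
print(conformance_class(make_stub(TRANSITIONAL_NS)))
```

The same check should carry over to WordprocessingML and SpreadsheetML by swapping in their main part paths and namespaces.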
Get the standard
(search “ISO 29500 download”)
Parts of the ISO 29500 Standard
Part 1, Markup Fundamentals: parts and reference
Part 2, Open Packaging: packaging model
Part 3, Compatibility/Extensibility: MCE elements and attributes
Part 4, Transitional: namespaces and elements in transitional markup
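The packaging model in Part 2 can be illustrated with a short sketch: an Open XML file is a ZIP archive whose [Content_Types].xml assigns a content type to every part, either by file extension (Default) or by part name (Override). The code below uses Python's standard library rather than the Open XML SDK, and the demo package is a hypothetical stub, not a deck PowerPoint would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Hypothetical content-type map for the stub package below.
CONTENT_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/ppt/presentation.xml" ContentType="application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml"/>
</Types>"""


def build_demo_package() -> bytes:
    """Create a tiny in-memory package so the example needs no files on disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr(
            "_rels/.rels",
            '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"/>')
        package.writestr("ppt/presentation.xml", "<presentation/>")
    return buffer.getvalue()


def part_content_types(package_bytes: bytes) -> dict:
    """Map each part name to its content type using [Content_Types].xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        types = ET.fromstring(package.read("[Content_Types].xml"))
        defaults = {d.get("Extension").lower(): d.get("ContentType")
                    for d in types.findall(f"{{{CT_NS}}}Default")}
        overrides = {o.get("PartName"): o.get("ContentType")
                     for o in types.findall(f"{{{CT_NS}}}Override")}
        result = {}
        for name in package.namelist():
            if name == "[Content_Types].xml":
                continue  # the map itself is not a part
            part_name = "/" + name
            extension = name.rsplit(".", 1)[-1].lower()
            # An Override for the exact part name wins over the extension Default.
            result[part_name] = overrides.get(part_name, defaults.get(extension))
        return result


for part, content_type in part_content_types(build_demo_package()).items():
    print(part, "->", content_type)
```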
WordprocessingML, SpreadsheetML, PresentationML & DrawingML
• Parts and elements to get started:
  • WordprocessingML
    – 11.3.10 (document.xml)
    – 17.2 and 17.3 (body, paragraphs and runs)
  • SpreadsheetML
    – 12.3.24 (sheet<x>.xml)
    – 18.3 (Worksheet elements)
  • PresentationML
    – 13.3.8 (slide<x>.xml)
    – 19.3 (slide elements)
• Schemas:
    – Annex A – W3C XML Schema
    – Annex B – RELAX NG
• Primer
    – Annex L – detailed intro to the MLs. Use this!
<w:document …>
  <w:body>
    <w:p w14:paraId="2673269E" w14:textId="522FD3EB" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">
      <w:r><w:t xml:space="preserve">This is a run of text. It is part of a paragraph.</w:t></w:r>
    </w:p>
    <w:p w14:paraId="43453C87" w14:textId="3F744A5F" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">
      <w:r><w:t>Here is another run of text in a new paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>
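As a rough illustration of how the body, paragraph and run structure above can be read programmatically, the following sketch extracts the text of each paragraph. It uses Python's xml.etree.ElementTree instead of the Open XML SDK; the sample markup is a simplified, hypothetical variant of the excerpt (revision attributes dropped, first paragraph split into two runs).

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Simplified stand-in for word/document.xml.
DOCUMENT_XML = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r><w:t xml:space="preserve">This is a run of text. </w:t></w:r>
      <w:r><w:t>It is part of a paragraph.</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Here is another run of text in a new paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml: str) -> list:
    """Return one string per w:p, concatenating the w:t text of its direct runs."""
    root = ET.fromstring(document_xml)
    return ["".join(t.text or "" for t in paragraph.findall("w:r/w:t", NS))
            for paragraph in root.findall("w:body/w:p", NS)]


for text in paragraph_texts(DOCUMENT_XML):
    print(text)
```

Only runs that are direct children of a paragraph are visited here; runs nested in hyperlinks or content controls would need a deeper search.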
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml…">
  <dimension ref="A1:B4"/>
  <sheetData>
    <row r="1" spans="1:2" x14ac:dyDescent="0.25"><c r="A1"><v>1</v></c></row>
    <row r="2" spans="1:2" x14ac:dyDescent="0.25"><c r="A2"><v>2</v></c></row>
    <row r="4" spans="1:2" x14ac:dyDescent="0.25"><c r="B4" t="s"><v>1</v></c></row>
  </sheetData>
</worksheet>
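In the worksheet excerpt, cell B4 carries t="s", which means its value is an index into the workbook's shared string table rather than a literal. The sketch below resolves such cells. It uses Python's standard library rather than the Open XML SDK, and the shared string table contents are invented for illustration, since the slide does not show them.

```python
import xml.etree.ElementTree as ET

S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S_NS}

SHEET_XML = f"""<worksheet xmlns="{S_NS}">
  <sheetData>
    <row r="1"><c r="A1"><v>1</v></c></row>
    <row r="2"><c r="A2"><v>2</v></c></row>
    <row r="4"><c r="B4" t="s"><v>1</v></c></row>
  </sheetData>
</worksheet>"""

# Hypothetical shared string table (xl/sharedStrings.xml in a real workbook).
SHARED_STRINGS_XML = f"""<sst xmlns="{S_NS}">
  <si><t>first</t></si>
  <si><t>second</t></si>
</sst>"""


def cell_values(sheet_xml: str, shared_strings_xml: str) -> dict:
    """Map cell references to values, resolving shared-string cells (t="s")."""
    table = ET.fromstring(shared_strings_xml)
    # Simplification: join every t element under an si item.
    strings = ["".join(t.text or "" for t in item.iter(f"{{{S_NS}}}t"))
               for item in table.findall("s:si", NS)]
    values = {}
    for cell in ET.fromstring(sheet_xml).findall("s:sheetData/s:row/s:c", NS):
        raw = cell.findtext("s:v", default="", namespaces=NS)
        # Shared-string cells store a zero-based index; other cells keep the raw text.
        values[cell.get("r")] = strings[int(raw)] if cell.get("t") == "s" else raw
    return values


print(cell_values(SHEET_XML, SHARED_STRINGS_XML))
```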
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" …>
  <p:cSld><p:spTree><p:sp>
    <p:nvSpPr><p:cNvPr id="2" name="Title 1"/><p:cNvSpPr><a:spLocks noGrp="1"/></p:cNvSpPr></p:nvSpPr>
    <p:spPr/>
    <p:txBody><a:p>
      <a:r><a:rPr lang="en-US" dirty="0" smtClean="…"/><a:t>Fancy art from the internet</a:t></a:r>
    </a:p></p:txBody>
  </p:sp></p:spTree></p:cSld>
</p:sld>
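The slide markup nests DrawingML text (a:p, a:r, a:t) inside PresentationML shapes (p:sp). A minimal sketch of pulling the text out of each named shape follows. It uses Python's standard library rather than the Open XML SDK, and the sample slide is a simplified, hypothetical version of the excerpt.

```python
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
NS = {"a": A_NS, "p": P_NS}

# Simplified stand-in for ppt/slides/slide<x>.xml.
SLIDE_XML = f"""<p:sld xmlns:a="{A_NS}" xmlns:p="{P_NS}">
  <p:cSld><p:spTree>
    <p:sp>
      <p:nvSpPr><p:cNvPr id="2" name="Title 1"/><p:cNvSpPr/><p:nvPr/></p:nvSpPr>
      <p:spPr/>
      <p:txBody>
        <a:bodyPr/>
        <a:p><a:r><a:rPr lang="en-US"/><a:t>Fancy art from the internet</a:t></a:r></a:p>
      </p:txBody>
    </p:sp>
  </p:spTree></p:cSld>
</p:sld>"""


def shape_texts(slide_xml: str) -> dict:
    """Map each top-level shape's name to its text, one line per a:p paragraph."""
    result = {}
    for shape in ET.fromstring(slide_xml).findall("p:cSld/p:spTree/p:sp", NS):
        name = shape.find("p:nvSpPr/p:cNvPr", NS).get("name")
        paragraphs = ["".join(t.text or "" for t in paragraph.findall("a:r/a:t", NS))
                      for paragraph in shape.findall("p:txBody/a:p", NS)]
        result[name] = "\n".join(paragraphs)
    return result


print(shape_texts(SLIDE_XML))
```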
Supporting Open Specifications
[MS-OI29500]: Notes for Microsoft Office products implementing ISO/IEC 29500
[MS-ODRAWXML], [MS-DOCX], [MS-XLSX], [MS-PPTX]: Extensions to the DrawingML, WordprocessingML, SpreadsheetML and PresentationML standard elements defined in ISO/IEC 29500
See reference note for URL to MSDN page for these.
Open XML SDK on GitHub
• GitHub: fork, build, modify
• NuGet package: release and latest builds
• Issues: report bugs
• Contribution: enhancements, fixes
SDK Classes
Open XML SDK API (DocumentFormat.OpenXml.dll)
• Part API
• DOM API
• Framework API
• Schema validation
• Semantic validation
SDK Class generation process
1. Generate OpenXmlPart API source code
2. Build Framework API
3. Process schema files
4. Generate DOM API source code
5. Generate schema validation data
6. Generate semantic validation data
7. Build DocumentFormat.OpenXml DLL
Common Scenarios for each Markup Language
Generation Extraction Transform
WordprocessingML
SpreadsheetML
PresentationML
Best Practices and Getting Started
• XSL Transformation using Flat OPC
• Use Templates (Office)
• Use Reflected code and modify (Productivity Tool)
• Use LINQ to find target elements and parts (.NET)
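The first practice above relies on Flat OPC, a single-file XML form of a package that an XSLT processor can consume directly. The sketch below shows roughly how a ZIP package might be flattened: each part becomes a pkg:part element holding either pkg:xmlData or base64 pkg:binaryData. It uses Python's standard library rather than the Open XML SDK, the demo package is a hypothetical stub, and it omits details a real converter would need, such as the mso-application processing instruction that tells Windows which Office application opens the file.

```python
import base64
import io
import zipfile
import xml.etree.ElementTree as ET

PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
ET.register_namespace("pkg", PKG_NS)

CONTENT_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="png" ContentType="image/png"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def build_demo_package() -> bytes:
    """Hypothetical stub package with two XML parts and one binary part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr(
            "_rels/.rels",
            '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"/>')
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/media/image1.png", b"\x89PNG")  # placeholder bytes, not a real image
    return buffer.getvalue()


def to_flat_opc(package_bytes: bytes) -> ET.Element:
    """Return a pkg:package element containing one pkg:part per ZIP entry."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        types = ET.fromstring(package.read("[Content_Types].xml"))
        defaults = {d.get("Extension").lower(): d.get("ContentType")
                    for d in types.findall(f"{{{CT_NS}}}Default")}
        overrides = {o.get("PartName"): o.get("ContentType")
                     for o in types.findall(f"{{{CT_NS}}}Override")}
        flat = ET.Element(f"{{{PKG_NS}}}package")
        for name in package.namelist():
            if name == "[Content_Types].xml" or name.endswith("/"):
                continue  # content types move onto each part; skip directory entries
            part_name = "/" + name
            extension = name.rsplit(".", 1)[-1].lower()
            content_type = overrides.get(part_name) or defaults.get(extension, "")
            part = ET.SubElement(flat, f"{{{PKG_NS}}}part", {
                f"{{{PKG_NS}}}name": part_name,
                f"{{{PKG_NS}}}contentType": content_type,
            })
            data = package.read(name)
            if content_type.endswith("xml"):
                # XML parts are embedded as parsed markup.
                ET.SubElement(part, f"{{{PKG_NS}}}xmlData").append(ET.fromstring(data))
            else:
                # Everything else is carried as base64 text.
                ET.SubElement(part, f"{{{PKG_NS}}}binaryData").text = \
                    base64.b64encode(data).decode("ascii")
        return flat


print(ET.tostring(to_flat_opc(build_demo_package()), encoding="unicode"))
```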
Real World Usage
• Microsoft Open Specifications!
• Microsoft internal/public
• Many major organizations
Putting it all together:
High-performance extraction, modification and generation of
Demo: Transform a Presentation using Azure Functions and Flow
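The demo itself is built on the Open XML SDK, Azure Functions and Flow, and its code is not reproduced in these slides. As a stand-in, here is a minimal sketch of the core idea, swapping a deck's theme part, using only Python's standard library. It assumes the theme lives at the conventional path ppt/theme/theme1.xml, and the function name and stub deck are hypothetical. A real theme change would likely also need to reconcile slide masters and layouts with the new theme.

```python
import io
import zipfile

THEME_PART = "ppt/theme/theme1.xml"  # assumption: conventional theme part name
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def replace_theme(deck_bytes: bytes, theme_xml: bytes,
                  theme_part: str = THEME_PART) -> bytes:
    """Return a copy of the package with one theme part's bytes replaced."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(deck_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        if theme_part not in source.namelist():
            raise KeyError(f"{theme_part} not found in package")
        # ZIP entries cannot be edited in place, so copy every entry across
        # and substitute the new bytes for the theme part.
        for name in source.namelist():
            data = theme_xml if name == theme_part else source.read(name)
            target.writestr(name, data)
    return output.getvalue()


def make_stub_deck() -> bytes:
    """Hypothetical stub package; not a deck PowerPoint would open."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("ppt/presentation.xml", "<presentation/>")
        package.writestr(THEME_PART, f'<a:theme xmlns:a="{A_NS}" name="Old"/>')
    return buffer.getvalue()


NEW_THEME = f'<a:theme xmlns:a="{A_NS}" name="Conference"/>'.encode("utf-8")
updated = replace_theme(make_stub_deck(), NEW_THEME)
with zipfile.ZipFile(io.BytesIO(updated)) as package:
    print(package.read(THEME_PART).decode("utf-8"))
```

In the automated pipeline described on the earlier slides, a function like this would sit behind the upload trigger, receiving the deck bytes and returning the transformed deck.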
Upcoming Sessions
Data Portability on the Cloud with the Office Open XML SDK
Tools
Open XML Package Editor for Visual Studio: https://github.com/OfficeDev/Open-XML-Package-Editor-Power-Tool-for-Visual-Studio
OOXML Tools Extension for Chrome
(search “ooxml tools chrome” and install in Chrome)
Open XML SDK Productivity Tool
(search “open xml sdk 2.5”, click download, OpenXMLSDKToolV25.msi)
Support & Resources
SDK
Open XML SDK
https://github.com/OfficeDev/Open-Xml-Sdk
OpenXMLDeveloper
http://www.openxmldeveloper.org
libopc
http://libopc.codeplex.com (third-party open source OOXML library)
Transforming Open XML Documents using XSLT
https://blogs.msdn.microsoft.com/ericwhite/2008/09/29/transforming-open-xml-documents-using-xslt/
Azure Functions
https://docs.microsoft.com/en-us/azure/azure-functions/
Open Specifications
https://social.msdn.microsoft.com/Forums/en-US/home?category=openspecifications