Transform Open XML Documents with Open XML SDK, Azure Functions and Microsoft Flow
Tom Jebo
Sr Escalation Engineer
India
2014
Frustration!
What if…we could automate?
Goals
• Allow presenters to upload their slide deck
• Automatically process the deck to change the theme
What do we need?
• A repository for the decks
• An understanding of Open XML File Format
• A way to programmatically transform a deck
• A way to automate the process
Agenda
Overview of Open XML File Format Markup Languages
Open XML SDK
Demo: Transforming a Presentation
Tools, Support & Resources
Brief history of Office Open XML
2000: First XML-based format used by Office XP
2003: Microsoft Office XML format released in Office 2003
2005/6: Office Open XML submitted to ECMA Int’l
2007: Office 2007 makes OOXML the default file format
2008: ISO/IEC 29500:2008 published
Support by Office Version
First to support OOXML, default format
ECMA-376 support
ISO/IEC 29500 “Transitional” r/w
ISO/IEC 29500 “Strict” read
ECMA-376 read
ISO/IEC 29500 “Strict” r/w
i.e. File | Save As… and choose “Strict Open XML Presentation (*.pptx)”
“Strict” r/w
Get the standard
(search “iso 29500 download”)
Parts of the ISO 29500 Standard
• Part 1, Markup Fundamentals: parts reference
• Part 2, Open Packaging: packaging model
• Part 3, Compatibility/Extensibility: MCE elements and attributes
• Part 4, Transitional: namespaces and elements in transitional markup
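The packaging model from Part 2 can be illustrated with a short sketch. An Open XML file is a ZIP package whose parts are linked by relationship files, and a consumer finds the main part by following the package-level relationship in `_rels/.rels`. The code below is not the Open XML SDK: it is a standard-library Python sketch, and the toy package it builds is an assumption made for illustration (it is not a complete file that PowerPoint would open).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces and relationship type defined by the Open Packaging Conventions
# and the Office Open XML standard.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
OFFICE_DOC_REL = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)


def build_toy_package() -> bytes:
    """Build a tiny illustrative package in memory (hypothetical content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr(
            "[Content_Types].xml",
            f'<Types xmlns="{CT_NS}">'
            '<Default Extension="xml" ContentType="application/xml"/>'
            "</Types>",
        )
        z.writestr(
            "_rels/.rels",
            f'<Relationships xmlns="{REL_NS}">'
            f'<Relationship Id="rId1" Type="{OFFICE_DOC_REL}" '
            'Target="ppt/presentation.xml"/>'
            "</Relationships>",
        )
        # Placeholder part body; a real presentation part is PresentationML.
        z.writestr("ppt/presentation.xml", "<presentation/>")
    return buf.getvalue()


def main_part_name(package_bytes: bytes) -> str:
    """Follow the package-level relationship to the main document part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        rels = ET.fromstring(z.read("_rels/.rels"))
    for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOC_REL:
            return rel.get("Target")
    raise ValueError("no officeDocument relationship found")


print(main_part_name(build_toy_package()))  # prints ppt/presentation.xml
```

Resolving parts through relationships rather than hard-coded paths is the reason the sketch reads `_rels/.rels` first: part names inside a package are a convention, not a guarantee.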
WordprocessingML, SpreadsheetML, PresentationML & DrawingML
• Parts and elements to get started:
  • WordprocessingML
    • 11.3.10 (document.xml)
    • 17.2 and 17.3 (body, paragraphs and runs)
  • SpreadsheetML
    • 12.3.24 (sheet<x>.xml)
    • 18.3 (Worksheet elements)
  • PresentationML
    • 13.3.8 (slide<x>.xml)
    • 19.3 (slide elements)
• Schemas:
  • Annex A – W3C XML Schema
  • Annex B – RELAX NG
• Primer
  • Annex L – detailed intro to the MLs. Use this!
<!-- WordprocessingML excerpt; namespace declarations on w:document are omitted -->
<w:document>
  <w:body>
    <w:p w14:paraId="2673269E" w14:textId="522FD3EB" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">
      <w:r><w:t xml:space="preserve">This is a run of text. It is part of a paragraph.</w:t></w:r>
    </w:p>
    <w:p w14:paraId="43453C87" w14:textId="3F744A5F" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">
      <w:r><w:t>Here is another run of text in a new paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>
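To make the body, paragraph and run structure of the WordprocessingML fragment above concrete, here is a minimal sketch that reads the text of each paragraph by joining its runs. It uses Python's standard library rather than the Open XML SDK. The embedded XML is a simplified copy of the fragment, with the `w14` and `rsid` attributes left out and the `w` namespace declared so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (prefix "w" in document.xml).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Simplified copy of the slide's fragment (attributes omitted for brevity).
DOCUMENT_XML = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t xml:space="preserve">This is a run of text. It is part of a paragraph.</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Here is another run of text in a new paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml: str) -> list[str]:
    """Return one string per w:p, built by concatenating its w:r/w:t text."""
    ns = {"w": W}
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.findall("./w:body/w:p", ns):
        runs = paragraph.findall("./w:r/w:t", ns)
        texts.append("".join(t.text or "" for t in runs))
    return texts


for line in paragraph_texts(DOCUMENT_XML):
    print(line)
```

A paragraph's visible text is often split across several runs, which is why the sketch joins all `w:t` elements under each `w:p` instead of reading only the first.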
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml…">
  <dimension ref="A1:B4"/>
  <sheetData>
    <row r="1" spans="1:2" x14ac:dyDescent="0.25">
      <c r="A1"><v>1</v></c>
    </row>
    <row r="2" spans="1:2" x14ac:dyDescent="0.25">
      <c r="A2"><v>2</v></c>
    </row>
    <row r="4" spans="1:2" x14ac:dyDescent="0.25">
      <c r="B4" t="s"><v>1</v></c>
    </row>
  </sheetData>
</worksheet>
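The SpreadsheetML fragment above shows two kinds of cells: plain numeric cells, and a cell with `t="s"` whose `<v>` is an index into the workbook's shared string table rather than the value itself. The sketch below resolves both. It is standard-library Python, not the Open XML SDK, and the `SHARED_STRINGS` list is a hypothetical stand-in for the contents of the shared strings part, which the slide does not show.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace (the default namespace of sheet<x>.xml).
S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Simplified copy of the slide's fragment (row attributes omitted).
SHEET_XML = f"""<worksheet xmlns="{S}">
  <dimension ref="A1:B4"/>
  <sheetData>
    <row r="1"><c r="A1"><v>1</v></c></row>
    <row r="2"><c r="A2"><v>2</v></c></row>
    <row r="4"><c r="B4" t="s"><v>1</v></c></row>
  </sheetData>
</worksheet>"""

# Hypothetical shared string table; in a real workbook this comes from the
# shared strings part, not from the worksheet.
SHARED_STRINGS = ["Name", "Total"]


def cell_values(sheet_xml: str, shared_strings: list[str]) -> dict:
    """Map cell references (e.g. "A1") to resolved values."""
    ns = {"s": S}
    root = ET.fromstring(sheet_xml)
    values = {}
    for cell in root.findall("./s:sheetData/s:row/s:c", ns):
        v = cell.find("s:v", ns)
        if v is None:
            continue  # empty cell
        if cell.get("t") == "s":
            # Shared string: <v> holds a zero-based index into the table.
            values[cell.get("r")] = shared_strings[int(v.text)]
        else:
            # No t attribute: treated here as a number.
            values[cell.get("r")] = float(v.text)
    return values


print(cell_values(SHEET_XML, SHARED_STRINGS))
```

Only the `s` cell type and the default numeric case are handled; other values of `t` (booleans, inline strings, errors) would need their own branches.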
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" …>
  <p:cSld><p:spTree><p:sp>
    <p:nvSpPr>
      <p:cNvPr id="2" name="Title 1"/>
      <p:cNvSpPr><a:spLocks noGrp="1"/></p:cNvSpPr>
    </p:nvSpPr>
    <p:spPr/>
    <p:txBody>
      <a:p>
        <a:r><a:rPr lang="en-US" dirty="0" smtClean…/><a:t>Fancy art from the internet</a:t></a:r>
      </a:p>
    </p:txBody>
  </p:sp></p:spTree></p:cSld>
</p:sld>
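The PresentationML fragment above nests DrawingML text (`a:p`, `a:r`, `a:t`) inside a shape (`p:sp`) that is named by its `p:cNvPr` element. As a rough sketch of how a tool might pull slide text out by shape name, the following standard-library Python (not the Open XML SDK) parses a simplified copy of the fragment. Both namespace declarations are written out in full, and the truncated `smtClean` attribute is left out.

```python
import xml.etree.ElementTree as ET

# PresentationML and DrawingML main namespaces.
P = "http://schemas.openxmlformats.org/presentationml/2006/main"
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
NS = {"p": P, "a": A}

# Simplified copy of the slide's fragment.
SLIDE_XML = f"""<p:sld xmlns:a="{A}" xmlns:p="{P}">
  <p:cSld><p:spTree><p:sp>
    <p:nvSpPr>
      <p:cNvPr id="2" name="Title 1"/>
      <p:cNvSpPr><a:spLocks noGrp="1"/></p:cNvSpPr>
    </p:nvSpPr>
    <p:spPr/>
    <p:txBody>
      <a:p>
        <a:r><a:rPr lang="en-US" dirty="0"/><a:t>Fancy art from the internet</a:t></a:r>
      </a:p>
    </p:txBody>
  </p:sp></p:spTree></p:cSld>
</p:sld>"""


def shape_texts(slide_xml: str) -> dict[str, str]:
    """Map each shape's name to the concatenated text of its a:t elements."""
    root = ET.fromstring(slide_xml)
    result = {}
    for shape in root.iter(f"{{{P}}}sp"):
        props = shape.find("./p:nvSpPr/p:cNvPr", NS)
        name = props.get("name") if props is not None else ""
        result[name] = "".join(t.text or "" for t in shape.iter(f"{{{A}}}t"))
    return result


print(shape_texts(SLIDE_XML))
```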
Supporting Open Specifications
[MS-OI29500]: Notes for Microsoft Office products implementing ISO/IEC 29500
[MS-ODRAWXML], [MS-DOCX], [MS-XLSX], [MS-PPTX]: Extensions to the DrawingML, WordprocessingML, SpreadsheetML and PresentationML standard elements defined in ISO/IEC 29500
See reference note for URL to MSDN page for these.
Open XML SDK on GitHub
• GitHub: fork, build, modify
• NuGet package: release and latest builds
• Issues: report bugs
• Contribution: enhancements, fixes
SDK Classes
OOXML SDK API (DocumentFormat.OpenXml.dll)
• Framework API
• Part API
• DOM API
• Schema validation
• Semantic validation
SDK Class generation process
1. Generate OpenXmlPart API Source Code
2. Build Framework API
3. Process Schema Files
4. Generate DOM API Source Code
5. Generate Schema Validation Data
6. Generate Semantic Validation Data
7. Build DocumentFormat.OpenXML DLL
Common Scenarios
Generation, Extraction, Transform
WordprocessingML
SpreadsheetML
PresentationML
Best Practices and Getting Started
• XSL Transformation using Flat OPC
• Use Templates (Office)
• Use Reflected code and modify (Productivity Tool)
• Use LINQ to find target elements and parts (.NET)
Real World Usage
• Microsoft Open Specifications!
• Microsoft internal/public
• Many major organizations
Putting it all together:
High-performance extraction, modification and generation of
Demo: Transform a Presentation using Azure Functions and Flow
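The demo itself uses the Open XML SDK inside an Azure Function, and its code is not part of this transcript. As a rough illustration of the underlying idea only, the sketch below swaps one part of a package for another using Python's standard library. The part name `ppt/theme/theme1.xml` is the conventional location of a presentation theme and is an assumption here; the helper names and the toy deck contents are hypothetical, and a real theme change may also need slide masters and layouts to be updated.

```python
import io
import zipfile


def replace_part(package_bytes: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of the package with one part's content replaced."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src:
        if part_name not in src.namelist():
            raise KeyError(f"part not found: {part_name}")
        out = io.BytesIO()
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            # Copy every part unchanged except the one being replaced.
            for name in src.namelist():
                data = new_content if name == part_name else src.read(name)
                dst.writestr(name, data)
    return out.getvalue()


def build_toy_deck() -> bytes:
    """Build a stand-in package with placeholder parts (not a valid .pptx)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("ppt/slides/slide1.xml", b"<slide/>")
        z.writestr("ppt/theme/theme1.xml", b"<theme name='Old'/>")
    return buf.getvalue()


THEME_PART = "ppt/theme/theme1.xml"  # conventional name, assumed here
updated = replace_part(build_toy_deck(), THEME_PART, b"<theme name='New'/>")
with zipfile.ZipFile(io.BytesIO(updated)) as z:
    print(z.read(THEME_PART))
```

In an automated pipeline, a function like `replace_part` would sit between the upload trigger and the step that writes the transformed deck back to the repository.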
Upcoming Sessions
Tomorrow:
9:00am Intro to Open Specifications
9:15am Exchange Server Protocol Overview 2019
9:45am-1:15pm
MAPI, FSSHTTP & WOPI Roundtables
1:15-4:45pm
SQL Server 2019 Protocol Overview
SQL Server Remote Storage (SMB)
SQL Server 2019 Big Data Cluster Overview
Introducing Azure SQL DB Edge
Tools
Open XML Package Editor for Visual Studio:
https://github.com/OfficeDev/Open-XML-Package-Editor-Power-Tool-for-Visual-Studio
OOXML Tools Extension for Chrome
(search “ooxml tools chrome” and install in Chrome)
Open XML SDK Productivity Tool
(search “open xml sdk 2.5”, click download, OpenXMLSDKToolV25.msi)
Support & Resources
SDK
Open XML SDK: https://github.com/OfficeDev/Open-Xml-Sdk
OpenXMLDeveloper: http://www.openxmldeveloper.org
libopc: http://libopc.codeplex.com (third-party open source OOXML library)
Transforming Open XML Documents using XSLT: https://blogs.msdn.microsoft.com/ericwhite/2008/09/29/transforming-open-xml-documents-using-xslt/
Azure Functions: https://docs.microsoft.com/en-us/azure/azure-functions/
Open Specifications: https://social.msdn.microsoft.com/Forums/en-US/home?category=openspecifications