transform open xml documents with open xml sdk, azure ... · open xml sdk demo: transforming a...

Post on 12-Oct-2020

17 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Tom Jebo

Sr Escalation Engineer

Transform Open XML Documents with Open XML

SDK, Azure Functions and Microsoft Flow

India 2014

Frustration!

What if…we could automate?Goals• Allow presenters to upload their slide deck

• Automatically process the deck to change the theme

What do we need?• A repository for the decks

• An understanding of Open XML File Format

• A way to programmatically transform a deck

• A way to automate the process

Overview to Open XML File Format Markup Languages

Open XML SDK

Demo: Transforming a Presentation

Tools, Support & Resources

Agenda

Brief history of Office Open XML

2000

First XML based format used by

OfficeXP

2003

Microsoft Office XML format

released in Office 2003

2005/6

Office Open XML submitted to

ECMA Int’l

2007

Office 2007 makes OOXML

default file format

2008

ISO/IEC 29500:2008 published

Support by Office Version

First to support OOXML, default

format

ECMA-376 support

ISO/IEC 29500 “Transitional” r/w

ISO/IEC 29500 “Strict” read

ECMA-376 read

ISO/IEC 29500 “Strict” r/wi.e. File | Save As… and choose “Strict Open XML Presentation (*.pptx)

“Strict” r/w

Get the standard

iso 29500 download

Parts of the ISO 29500 Standard

Parts

Reference

Part 1MarkupFundamentals

Packaging

Model

Part 2Open Packaging

MCE elements

and attributes

Part 3Compatibility/Extensibilit

y

Namespaces

and elements

in transitional

markup

Part 4Transitional

WordprocessingML, SpreadsheetML, PresentationML & DrawingML

• Parts and elements to get started:

• WordprocessingML– 11.3.10 (document.xml)

– 17.2 and 17.3 (body, paragraphs and runs)

• SpreadsheetML– 12.3.24 (sheet<x>.xml)

– 18.3 (Worksheet elements)

• PresentationML– 13.3.8 (slide<x>.xml)

– 19.3 (slide elements)

• Schemas: – Annex A – W3C XML1

– Annex B – RELAX NG9

• Primer– Annex L – detailed intro to the ML’s. Use this!

<w:document<w:body><w:p w14:paraId="2673269E" w14:textId="522FD3EB" w:rsidR="00BD0355" w:rsidRDefault="00BD0355">

<w:r><w:t xml:space="preserve">This is a run of text. It is part of a paragraph.

</w:r></w:p>

<w:p w14:paraId="43453C87" w14:textId="3F744A5F" w:rsidR="00BD0355" w:rsidRDefault="00BD0355"><w:r>

<w:t>Here is another run of text in a new paragraph.</</w:r>

</w:p>

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml<dimension ref="A1:B4"/><sheetData>

<row r="1" spans="1:2" x14ac:dyDescent="0.25"><c r="A1">

<v>1</v></c>

</row><row r="2" spans="1:2" x14ac:dyDescent="0.25">

<c r="A2"><v>2</v>

</c></row><row r="4" spans="1:2" x14ac:dyDescent="0.25">

<c r="B4" t="s"><v>1</v>

</c>

<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" <p:cSld>

<p:spTree><p:sp>

<p:nvSpPr><p:cNvPr id="2" name="Title 1"/><p:cNvSpPr>

<a:spLocks noGrp="1"/></p:cNvSpPr>

</p:nvSpPr><p:spPr/><p:txBody><a:p>

<a:r><a:rPr lang="en-US" dirty="0" smtClean<a:t>Fancy art from the internet</a:t

</a:r>

Supporting Open Specifications

[MS-OI29500]Notes for Microsoft Office products implementing

ISO/IEC 29500

[MS-ODRAWXML]

[MS-DOCX]

[MS-XLSX]

[MS-PPTX]Extensions to the DrawingML, WordprocessingML, SpreadsheetML and PresentationML standard elements defined in ISO/IEC 29500

See reference note for URL to MSDN page for these.10

Open XML SDK on Github

Open XML SDK on Github

Github

Fork, build, modify

Nuget package

Release and latest builds

Issues

Report bugs

Contribution

Enhancements, fixes

SDK Classes

OOXMLSDK API

Part API DOM APIFramework

APISchema

validationSemantic validation

DocumentFormat.OpenXml.dll

1. Generate OpenXmlPart

API Source Code

2. Build Framework

API

3. Process Schema Files

4. Generate DOM API

Source Code

5. Generate Schema

Validation Data

6. Generate Semantic Validation

Data

7. Build DocumentFormat.OpenXMLDLL

SDK Class generation process

Common Scenarios for each Markup Language

Generation Extraction Transform

WordprocessingML

SpreadsheetML

PresentationML

Best Practices and Getting Started

• XSL Transformation using Flat OPC

• Use Templates (Office)

• Use Reflected code and modify (Productivity Tool)

• Use LINQ to find target elements and parts (.Net)

Real World Usage

• Microsoft Open Specifications!

• Microsoft internal/public

• Many major organizations

Putting it all together:

High-performance extraction, modification and generation of

Demo: Transform a Presentation using Azure Functions and Flow

1

3

2

4

Upcoming Sessions

Data Portability on the Cloud with the Office Open XML SDK

Tools

Open XML Package Editor for Visual Studio:https://github.com/OfficeDev/Open-XML-Package-Editor-Power-Tool-for-Visual-Studio

OOXML Tools Extension for Chrome

(search “ooxml tools chrome” and install in Chrome)

Open XML SDK Productivity Tool

(search “open xml sdk 2.5”, click download, OpenXMLSDKToolV25.msi)

Support & Resources

SDK

Open XML SDK

https://github.com/OfficeDev/Open-Xml-Sdk

OpenXMLDeveloper

http://www.openxmldeveloper.org

libopc

http://libopc.codeplex.com (third-party open source OOXML library)

Transforming Open XML Documents using XSLThttps://blogs.msdn.microsoft.com/ericwhite/2008/09/29/transforming-open-xml-documents-using-xslt/

Azure Functions

https://docs.microsoft.com/en-us/azure/azure-functions/

Open Specifications

dochelp@Microsoft.com

https://social.msdn.microsoft.com/Forums/en-US/home?category=openspecifications

top related