Open XML Formats: Fabio Santini, .NET Developer Evangelist, Microsoft Italy

Page 1: Open XML Formats, Fabio Santini, .NET Developer Evangelist, Microsoft Italy

Page 2:

Open XML Formats

Fabio Santini
.NET Developer Evangelist
Microsoft Italy

Page 3:

New community formed to help bring developers together
Currently sponsored by almost 40 institutions from around the world
Community and website for information exchange
Free of Charge: Available to everyone who wants to participate; encourage development on all platforms
Be one of the first to join the community!
http://openxmldeveloper.org

Page 4:

FF301 Agenda

Overview of the new formats
Demo: New Office document
Additional benefits of the new formats
Role of XML in Office documents
Structural details of the new formats
Custom defined schema support
File format benefits for developers
Solution capabilities

Page 5:

Open XML Formats

New XML file formats for Word, Excel and PowerPoint
  New formats will be the default file formats, with new file type extensions (.docx; .pptx; .xlsx)
  Fully 100% compatible with existing formats
Open, transparent format improves interoperability
  XML - Transparent XML format enables new integration scenarios for documents and LOB systems
  ZIP container - allows for standard compression on all files without user effort
  Licensing - Removed need for license by providing a Covenant that says we won’t enforce IP against folks implementing the format (100% royalty free)
Standardization
  Ecma International - created TC45 to fully document the Open XML formats
    Members include: Apple, Barclays Capital, BP, the British Library, Essilor, Intel Corporation, NextPage Inc., Statoil ASA and Toshiba
    Current spec is already over 2000 pages

Page 6:

Office Open XML Formats

Page 7:

Microsoft Office Open XML Formats

Added Benefits: compact and robust
  ZIP container allows for standard compression on all files without user effort (dramatic file size improvements)
  Significantly more robust files to help minimize data loss
Backward Compatible: Office 2000, Office XP, Office 2003 will all support the new formats
  Patches for compatibility available by launch
  Open, edit and save new formats
Legacy support: Current Office 97-2003 binary file formats supported
  Support for XML formats from Office 2003, Office XP continued
Developers: Endless potential for developers
  Build solutions to read, write, and modify Office files (without the need to run Office APIs)

Page 8:

Benefits of the Office Open XML Formats

Page 9:

Evolution Of File Formats

Office 97: Existing binary file formats
  Designed in 1994, launched in Office 97
Office 2000: Early Innovation
  XML document properties
Office XP: First XML Format
  SpreadsheetXML
Office 2003: Breakthrough XML Support
  WordML, SpreadsheetML
  Custom-defined schema
Office 2007 (“Wave 12”): New XML Formats
  XML file format default
  XML PowerPoint format

Page 10:

The Role Of XML With Documents

Scenario: Document Assembly
  Server-based or user-assisted construction of documents from archived content or database content
  Example: Create sales reports from financial and forecast data stored in a CRM system

Scenario: Content Reuse
  Much easier to move content between documents, including different document types
  Example: Apply content stored in Word documents to Web pages quickly and efficiently

Scenario: Content Tagging
  Add domain-specific metadata to document content to enable custom solutions
  Example: Tag presentations using a specific taxonomy to improve knowledge management efficiency

Scenario: Document Interrogation
  Query document repositories based on custom data, content types or document metadata
  Example: Search for all documents containing a specific company name or sales contact

Scenario: Document Sanitization
  Remove unwanted content like comments or embedded code from your document when appropriate
  Example: Remove all tracked changes and comments from a Word document before it is published

Page 11:

Open XML Formats Architecture

User view: single Office “file”
  Questionnaire.docx

Developer view: modular file
  File Container
    Document Properties
    Comments
    Charts
    Embedded code / macros
    Images, video, sound
    Custom-defined XML
    WordML / SpreadsheetML, etc.

Document Parts
  Most parts are XML
  Each XML part is a discrete, compressed component
  Can add, extract and modify individual parts without using Office programs
  Corruption or absence of any part would not prohibit the file from being opened
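Because the package is an ordinary ZIP archive, individual parts can be listed, read and swapped with any ZIP library. The following is a minimal sketch in Python using only the standard library. The demo package, its part names and its contents are hypothetical stand-ins for a real .docx, and the helper names are not part of any Office API.

```python
import io
import zipfile


def make_demo_package() -> bytes:
    """Build a tiny in-memory ZIP that stands in for a real .docx (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("word/document.xml", "<doc>original text</doc>")
        pkg.writestr("docProps/core.xml", "<props>author</props>")
    return buffer.getvalue()


def list_parts(package: bytes) -> list[str]:
    """Return the names of all parts (ZIP entries) in the package."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        return pkg.namelist()


def read_part(package: bytes, part_name: str) -> bytes:
    """Extract a single part without touching the others."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        return pkg.read(part_name)


def replace_part(package: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of the package in which one part has new content.

    The zipfile module cannot overwrite an entry in place, so every
    other part is copied unchanged into a fresh archive.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == part_name:
                data = new_content
            else:
                data = src.read(item.filename)
            dst.writestr(item.filename, data)
    return out.getvalue()
```

With a real file, the same functions would be called on the bytes read from disk, for example `list_parts(open("Questionnaire.docx", "rb").read())`.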

Page 12:

Modifying An Excel Spreadsheet

Page 13:

Components Of The New Formats

Package - ZIP Container
Part - The “files” inside the ZIP
Content Types - Each part has a content type that is enforced on open
Relationships - Any part that references another part must do so via a relationship

[Diagram: example workbook package with parts for Document Properties, Application Properties, Custom Doc. Props., Workbook, Sheet 1, Sheet 2, Sheet 3, Styles, Chart and Strings, connected by relationships]
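As an illustration of how content types and relationships are recorded, the sketch below parses the two bookkeeping parts found in a package, `[Content_Types].xml` and `_rels/.rels`. It assumes the namespaces and content type strings of the formats as later standardized in ECMA-376, which may differ from the pre-release builds this talk describes. The sample XML and the function names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespaces as standardized in ECMA-376 (an assumption for this sketch).
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Illustrative content of [Content_Types].xml for a small workbook.
CONTENT_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels"
           ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/xl/workbook.xml"
            ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
</Types>"""

# Illustrative content of _rels/.rels: the package points at its main part.
RELS = f"""<Relationships xmlns="{REL_NS}">
  <Relationship Id="rId1"
      Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
      Target="xl/workbook.xml"/>
</Relationships>"""


def content_type_of(content_types_xml: str, part_name: str):
    """Resolve a part's content type: an Override wins, else the Default for its extension."""
    root = ET.fromstring(content_types_xml)
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


def relationships(rels_xml: str) -> dict:
    """Map each relationship Id to its (Type, Target) pair."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): (rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
    }
```

A consumer would typically start at `_rels/.rels`, follow the relationship to the main part, and then follow that part's own relationships to reach sheets, styles and strings.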

Page 14:

Create A Document From Scratch
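The demo itself is not captured in this transcript. As a rough idea of what creating a document from scratch can involve, here is a minimal sketch in Python that writes a three-part Word package: the content types part, the package relationships part and the main document part. The part names, namespaces and content types are those of the standardized ECMA-376 formats, which is an assumption relative to the pre-release software shown in the talk, and `build_docx` is a hypothetical helper name.

```python
import io
import zipfile
from xml.sax.saxutils import escape

XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'

CONTENT_TYPES = XML_DECL + (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

PACKAGE_RELS = XML_DECL + (
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

DOCUMENT_TEMPLATE = XML_DECL + (
    '<w:document '
    'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_docx(text: str) -> bytes:
    """Return the bytes of a one-paragraph .docx package containing `text`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("[Content_Types].xml", CONTENT_TYPES)
        pkg.writestr("_rels/.rels", PACKAGE_RELS)
        # escape() protects &, < and > so the part stays well-formed XML.
        pkg.writestr("word/document.xml",
                     DOCUMENT_TEMPLATE.format(text=escape(text)))
    return buffer.getvalue()
```

Writing the result to disk, for example `open("hello.docx", "wb").write(build_docx("Hello"))`, should give a file that a word processor supporting the format can open, although this sketch omits optional parts such as styles and document properties.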

Page 15:

The Role Of XML
Reference and custom-defined schemas

Custom-defined Schemas
  Data-oriented (e.g., Price, Invoice)
  Represents the business information stored in the document
  Enable System Integration

XML Reference Schemas
  Display-oriented (e.g., Bold, Italics, Tables, Paragraphs, Styles)
  Open Document Format
  Enable Archival & File Formats Interoperability

Page 16:

The Role Of XML
Reference and custom-defined schemas

XML Reference Schemas
  Display-oriented (e.g., Bold, Italics, Tables, Paragraphs, Styles)
  Open Document Format
  Enable Archival & File Formats Interoperability

<w:p>
  <w:r>
    <w:rPr><w:b /></w:rPr>
    <w:t>John Doe</w:t>
  </w:r>
  <w:r>
    <w:rPr><w:i /></w:rPr>
    <w:t>Health Agency</w:t>
  </w:r>
</w:p>
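To show how display-oriented markup like this can be read by a program, the sketch below parses the same paragraph and reports the text and formatting of each run. The slide omits the namespace declaration for the `w:` prefix, so the code adds the WordprocessingML namespace from the standardized format, which is an assumption. `describe_runs` is a hypothetical helper, and it treats the mere presence of `w:b` or `w:i` as "on", ignoring any attribute that might switch the property off.

```python
import xml.etree.ElementTree as ET

# Namespace of the standardized WordprocessingML (assumed; not shown on the slide).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# The paragraph from the slide, with the namespace declaration added.
PARAGRAPH = f"""<w:p xmlns:w="{W}">
  <w:r>
    <w:rPr><w:b /></w:rPr>
    <w:t>John Doe</w:t>
  </w:r>
  <w:r>
    <w:rPr><w:i /></w:rPr>
    <w:t>Health Agency</w:t>
  </w:r>
</w:p>"""


def describe_runs(paragraph_xml: str) -> list:
    """Return (text, is_bold, is_italic) for every run in a paragraph."""
    paragraph = ET.fromstring(paragraph_xml)
    runs = []
    for run in paragraph.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        is_bold = run.find("w:rPr/w:b", NS) is not None
        is_italic = run.find("w:rPr/w:i", NS) is not None
        runs.append((text, is_bold, is_italic))
    return runs
```

Here the first run carries the bold property and the second the italic property, so the two pieces of text are distinguished only by how they look, not by what they mean.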

Page 17:

The Role Of XML
Reference and custom-defined schemas

Custom-defined Schemas
  Data-oriented (e.g., Price, Invoice)
  Represents the business information stored in the document
  Enable System Integration

<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department>Health Agency</Department>
      <Potential>
        <Sales>100</Sales>
        <Growth>25%</Growth>
      …
    </Attendee>
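Data-oriented markup of this kind can be turned directly into records for another system. The sketch below is one way to do that in Python. The slide's fragment is truncated, so the sample here closes the open elements and adds nothing else; that completion is an assumption, as are the function and key names.

```python
import xml.etree.ElementTree as ET

# The slide's fragment with its open elements closed (assumed completion).
CONFERENCE_REPORT = """<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department>Health Agency</Department>
      <Potential>
        <Sales>100</Sales>
        <Growth>25%</Growth>
      </Potential>
    </Attendee>
  </Attendees>
</ConferenceReport>"""


def attendee_records(report_xml: str) -> list:
    """Extract one dictionary per attendee from a conference report."""
    root = ET.fromstring(report_xml)
    records = []
    for attendee in root.iterfind("Attendees/Attendee"):
        records.append({
            "date": root.findtext("Date"),
            "name": attendee.get("Name"),
            "department": attendee.findtext("Department", default="").strip(),
            "sales": int(attendee.findtext("Potential/Sales", default="0")),
            "growth": attendee.findtext("Potential/Growth"),
        })
    return records
```

Because the element names describe business meaning rather than appearance, a line-of-business system can consume the records without knowing anything about how the document is formatted.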

Page 18:

Custom Defined Schema

Page 19:

Developing With The Formats

More Reliable Solutions
  Third-party tools were main cause of document corruptions
Fully Documented Formats
  Freely available for download with a royalty free license
  Office file format schemas - Used to validate content for a given part
Samples, samples, samples
  In the form of code “snippets” for easier use and integration into your VSTO solutions
WinFX Package APIs
  Access/maintain parts and relationships within a file
  Takes care of all ZIP level functionality
XPath
  Navigation within content
XML DOM
  Manipulating content
Office Open XML Resource Kit
  Tools for constructing and deconstructing the new file formats
  Design time Validation tool
    Parses a file and reports on schema, relationship errors and warnings
  Runtime serialization tool
    Flattens package into a single file for ease of development in simple construction scenarios
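The slide names the WinFX Package APIs, XPath and the XML DOM as the tools for this work. As a language-neutral illustration of the last two steps, navigating to content with a path expression and then changing it, here is a small Python sketch using the standard library's ElementTree, which supports a subset of XPath. It is not the WinFX API. The namespace is that of the standardized format (an assumption), the sample document is illustrative, and the sketch only handles text that sits inside a single `w:t` element.

```python
import xml.etree.ElementTree as ET

# Namespace of the standardized WordprocessingML (assumed for this sketch).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Keep the familiar "w:" prefix when the tree is written back out.
ET.register_namespace("w", W)

# Illustrative main document part.
SAMPLE_DOCUMENT = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>John Doe</w:t></w:r>
      <w:r><w:t>Health Agency</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def replace_text(document_xml: str, old: str, new: str) -> str:
    """Replace `old` with `new` in every text element and return the new XML."""
    root = ET.fromstring(document_xml)
    # ".//w:t" is an XPath-style expression: every w:t at any depth.
    for text_node in root.iterfind(".//w:t", NS):
        if text_node.text and old in text_node.text:
            text_node.text = text_node.text.replace(old, new)
    return ET.tostring(root, encoding="unicode")
```

In a full solution the XML would come from a part extracted from the package, and the modified XML would be written back as that same part.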

Page 20:

Real World Example
Schema documentation

Initial submission to Ecma International TC45 was about 2500 pages
  ~25 namespaces
  ~94 XSD files
  622 simple types
  1444 complex types
  3299 attributes
  3371 enumerations
  3464 elements
Obviously, there is a huge amount of documentation to be done

Page 21:

Sample Solution Scenarios

Data interoperability
Content manipulation
Content sharing and reuse
Document assembly
Document security
Managing sensitive information
Document styling
Document profiling

Page 22:

Resources

Office Preview Site http://www.microsoft.com/office/preview/

Brian Jones’s Blog http://blogs.msdn.com/Brian_Jones/

Kevin Boske’s Blog http://blogs.msdn.com/KevinBoske/

Office 2003 Reference Schema Information http://www.microsoft.com/office/xml/

Page 23:

© 2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
