The Next Leap in JavaScript Performance
Mohammad Reza Haghighat
Senior Principal Engineer, Intel Corporation
October 20, 2014
Agenda
• HTML5 - The New Lingua Franca?
• Exposing the full power of modern hardware to JavaScript*
• Bringing Perceptual Computing to the web platform
• Supporting JavaScript programming in Internet of Things (IoT)
• Summary
2
HTML5 – The New Lingua Franca?
• 1991: APPS (.exe): native code, PC spiral
• 2001: WEB (HTML, Flash*): “Write once, run on any browser”
• 2009: APPS (iOS*, Android*, Windows*): app stores, walled gardens
• 2015: WEB (HTML5): “Write Once, Run Everywhere”
“New open standards created in the mobile era, such as HTML5, will win on mobile devices.” – Steve Jobs
“If you want to do something that is universal, no question, world is going HTML5.” – Steve Ballmer
“It looks to me like HTML5 will eventually become a way almost all applications are built, including those on new phones.” – Eric Schmidt
3
Web: The Ubiquitous Software Platform and the Application Model of the Future
[Diagram labels: Big Data & Content; Rich Capabilities; Social; Contextual; Crowdsourced; Sensors; “Things”]
4
Agenda
• HTML5 - The New Lingua Franca?
• Exposing the full power of modern hardware to JavaScript*
• Bringing Perceptual Computing to the web platform
• Supporting JavaScript programming in Internet of Things (IoT)
• Summary
5
Astounding JavaScript* Performance With asm.js
asm.js: a highly optimizable low-level subset of JavaScript, defined by Mozilla
Achieving ~1.5x native running time by targeting asm.js†
Epic* Games Unreal Engine* 3: over 1M lines of C/C++ code compiled to JavaScript* by Mozilla* and Epic (http://www.unrealengine.com/html5/)
Toolchain: LLVM Bitcode → Emscripten → asm.js (JavaScript*) → web
Very efficient code generated by Firefox* JIT
† Courtesy of Mozilla's Alon Zakai & Luke Wagner (http://people.mozilla.org/~lwagner/gdc-pres/gdc-2014.html#/)
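To make "low-level subset" concrete, here is a small hand-written module in the asm.js style. It is only a sketch: the names `SumModule`, `sum`, `heap` and `input` are made up for illustration, and real asm.js is normally emitted by Emscripten rather than written by hand. The `| 0` and unary `+` coercions act as type annotations (int and double) that an optimizing engine can rely on; engines without asm.js support simply run the same code as ordinary JavaScript.

```javascript
// Illustrative module in the asm.js style (names are hypothetical).
function SumModule(stdlib, foreign, heap) {
  "use asm";
  var f64 = new stdlib.Float64Array(heap); // typed view over the heap

  // Sum the first n doubles stored in the heap.
  function sum(n) {
    n = n | 0;        // parameter annotation: n is an int
    var i = 0;        // int local
    var total = 0.0;  // double local
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      // i << 3 is the byte offset; >> 3 turns it back into an element index
      total = total + +f64[(i << 3) >> 3];
    }
    return +total;    // return annotation: double
  }

  return { sum: sum };
}

var heap = new ArrayBuffer(0x10000); // 64 KiB heap
var input = new Float64Array(heap);
input[0] = 1.5;
input[1] = 2.5;
input[2] = 4.0;

var asmSum = SumModule(globalThis, {}, heap).sum(3);
console.log(asmSum); // 8
```

Every value in the function body has a statically known type, which is what lets a JIT compile it ahead of time without type guards.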
6
The March of Parallelism
Modern processors utilize parallelism to deliver high performance within a constrained power budget
[Roadmap figure; labels: 2002, 2006, 2008, 2012; 2010, 2011, 2012, 2013; 32 nm Tock, 22 nm Tick, 22 nm Tock]
Intel® Advanced Vector Extensions
• 1X = 128-bit, since 2001
• AVX: 256-bit floating point
• AVX2: FMA and integer support
• AVX-512 (Next Gen Intel® Xeon Phi™): 512-bit vectors
2X at each of three steps: 8X peak SIMD operations per core over 4 generations
7
Optimizing Web Runtimes for Parallelism
Web runtimes need to be parallel end-to-end
Stages: Parse + build DOM, JavaScript*, Layout Engine, Render
• GPU: parallel
• CPU: mainly single-threaded
Time breakdown: Render 35%, Layout 33%, Other 21%, JS 11%
• Today's HTML5 runtimes do not scale with the number of cores
• Parallelism is needed for both responsiveness and energy efficiency
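One way to read the breakdown above is through Amdahl's law, which is not named on the slide and is used here only as an illustration. The sketch below treats the slide's percentages as fractions of total time; the helper name `overallSpeedup` and the 4x factor are assumptions chosen for the example.

```javascript
// Amdahl's law: overall speedup when only `fraction` of the time
// is accelerated by `factor`.
function overallSpeedup(fraction, factor) {
  return 1 / ((1 - fraction) + fraction / factor);
}

// Making JavaScript (11% of the time) infinitely fast helps little.
var jsOnly = overallSpeedup(0.11, Infinity);

// Speeding up layout and render (33% + 35%) by an assumed 4x helps far more.
var layoutAndRender = overallSpeedup(0.33 + 0.35, 4);

console.log(jsOnly.toFixed(2));          // "1.12"
console.log(layoutAndRender.toFixed(2)); // "2.04"
```

Under these assumptions, optimizing the JavaScript engine alone cannot give more than about a 12% improvement, which is consistent with the slide's call for end-to-end parallelism.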
8
Parallel Parsing and Compilation
Background JIT compilers now in Chrome*, Firefox, Internet Explorer*, Safari*
PESPMA 2009
Four threads for JavaScript* parsing and compilation
JS and GFX execution
Epic* Citadel* profile on Firefox*
Cycle Breakdown by Categories
Categories: js::compile, gfx::compile, os::others, js::parse, js::others, browser::others, os::mem, js::jitted, gfx::exec
Values (%): 43.6, 16.6, 12.8, 6.7, 6.4, 6.2, 4.6, 2.2, 0.9
[Timeline labels: bootstrap launch 4 threads; 1 thread]
9
Layout Engine: a performance bottleneck
Towards Parallelizing the Browser Layout Engine
HotPar 2010
Mozilla* Firefox* Page-Load Tests
Zimbra* Collaboration Suite*
• Layout Engine ~42% execution
• CSS rule matching ~33% of the layout
Example rule: ul em {color:blue}
Browser layout engine is a bottleneck but amenable to parallelism
10
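The rule `ul em {color:blue}` uses a descendant selector, so matching it means walking up an element's ancestors. The following is a minimal sketch of that walk, not how Firefox implements it: the element shape `{ tag, parent }` and the function name are hypothetical. Each element is checked independently of the others, which hints at why rule matching can be spread across cores.

```javascript
// Hypothetical minimal element: { tag: string, parent: element or null }.
// `tags` is a descendant selector split into tag names, e.g. ["ul", "em"].
function matchesDescendantSelector(element, tags) {
  var i = tags.length - 1;
  // The rightmost part must match the element itself.
  if (element.tag !== tags[i]) {
    return false;
  }
  i--;
  // The remaining parts must match ancestors, in order, walking upward.
  var node = element.parent;
  while (i >= 0 && node) {
    if (node.tag === tags[i]) {
      i--;
    }
    node = node.parent;
  }
  return i < 0;
}

var ul = { tag: "ul", parent: null };
var li = { tag: "li", parent: ul };
var emInList = { tag: "em", parent: li };
var p = { tag: "p", parent: null };
var emInParagraph = { tag: "em", parent: p };

console.log(matchesDescendantSelector(emInList, ["ul", "em"]));      // true
console.log(matchesDescendantSelector(emInParagraph, ["ul", "em"])); // false
```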
Parallel JavaScript*
• Started at Intel Labs, now with Mozilla*
• Extends JavaScript* with a data-parallel API
• Designed for multi-core CPUs and GPUs
• Simple, portable, and secure
Array increment example:
Sequential: A.map(function(a) {return a+1;});
Parallel: A.mapPar(function(a) {return a+1;});
Accelerated animation of 3D avatars: more characters and more realism
The goal of Parallel JavaScript is to enable data parallelism in web applications
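`mapPar` was an experimental API and is not available in standard JavaScript engines, so code that wants to try it needs a fallback. The sketch below uses a hypothetical helper, `mapMaybeParallel`, that calls `mapPar` when it exists and `map` otherwise. It relies on the kernel being pure (no side effects), which is what allows elements to be processed in any order or at the same time.

```javascript
// Hypothetical helper: use the experimental mapPar when present,
// otherwise fall back to the sequential map.
function mapMaybeParallel(array, kernel) {
  if (typeof array.mapPar === "function") {
    return array.mapPar(kernel);
  }
  return array.map(kernel);
}

// A pure kernel: it reads its argument and returns a value, nothing else.
function increment(a) {
  return a + 1;
}

var A = [1, 2, 3, 4];
var B = mapMaybeParallel(A, increment);
console.log(B); // [ 2, 3, 4, 5 ]
```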
11
SIMD – Single Instruction, Multiple Data
SIMD operations deliver great performance & power efficiency
Scalar Operation (four separate additions)
Ax + Bx = Cx
Ay + By = Cy
Az + Bz = Cz
Aw + Bw = Cw
SIMD Operation of Vector Length 4 (one addition)
[Ax, Ay, Az, Aw] + [Bx, By, Bz, Bw] = [Cx, Cy, Cz, Cw]
Intel® Architecture currently has SIMD operations of vector length 4, 8, 16
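The difference between the two diagrams can be sketched in plain JavaScript. This is an emulation only: it groups work into four lanes to mirror the structure of a length-4 SIMD add, but without engine support each lane is still a scalar addition. The function names are made up for the example.

```javascript
// Scalar: one addition per loop iteration.
function addScalar(a, b, c) {
  for (var i = 0; i < a.length; i++) {
    c[i] = a[i] + b[i];
  }
}

// Emulated vector length 4: four lanes per step, then a scalar remainder loop.
function addLanes4(a, b, c) {
  var n = a.length - (a.length % 4);
  for (var i = 0; i < n; i += 4) {
    c[i] = a[i] + b[i];             // lane x
    c[i + 1] = a[i + 1] + b[i + 1]; // lane y
    c[i + 2] = a[i + 2] + b[i + 2]; // lane z
    c[i + 3] = a[i + 3] + b[i + 3]; // lane w
  }
  for (var j = n; j < a.length; j++) {
    c[j] = a[j] + b[j];
  }
}

var a = new Float32Array([1, 2, 3, 4, 5]);
var b = new Float32Array([10, 20, 30, 40, 50]);
var viaScalar = new Float32Array(5);
var viaLanes = new Float32Array(5);
addScalar(a, b, viaScalar);
addLanes4(a, b, viaLanes);
console.log(Array.from(viaLanes)); // [ 11, 22, 33, 44, 55 ]
```

With hardware SIMD, the four lane additions in the first loop would be a single instruction.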
12
SIMD - A Gap Between JavaScript* and Native
SIMD in JavaScript further reduces the performance gap
A Google*/Intel/Mozilla* ECMA TC39 Joint Project
• Bugzilla*: https://bugzilla.mozilla.org/show_bug.cgi?id=894105
• John McCutchan’s strawman proposal: http://wiki.ecmascript.org/doku.php?id=strawman:simd_number
[Code panels: C++ code for list average; “Proposed” JavaScript* code; SIMD code by ICC]
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
13
SIMD.JS – The API
Our Firefox* Prototype
Our SIMD prototype delivers 3x~4x Mandelbrot speedup†
† Initial support for float32x4 and int32x4
14
Demo: Combining SIMD and Higher-Level Parallelism
Our Chromium* Prototype
SIMD speedup is nicely multiplied by WebWorkers†
WW: Number of WebWorkers
† Source: Peter Jensen (Intel®): https://github.com/PeterJensen/mandelbrot
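The speedups multiply because the two kinds of parallelism are independent: each worker can take its own band of image rows and use SIMD inside that band. How the demo divides the work is not shown on the slide, so the following is only one plausible sketch; `splitRows` is a hypothetical helper, and creating the workers and posting the bands to them is left out.

```javascript
// Divide `height` image rows into one contiguous band per worker.
// endRow is exclusive; the first (height % workerCount) bands get one extra row.
function splitRows(height, workerCount) {
  var bands = [];
  var base = Math.floor(height / workerCount);
  var extra = height % workerCount;
  var start = 0;
  for (var w = 0; w < workerCount; w++) {
    var rows = base + (w < extra ? 1 : 0);
    bands.push({ worker: w, startRow: start, endRow: start + rows });
    start += rows;
  }
  return bands;
}

var bands = splitRows(10, 4);
console.log(bands.map(function (b) { return b.startRow + "-" + b.endRow; }).join(" "));
// 0-3 3-6 6-8 8-10
```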
15
SIMD Speedups on our Chromium* Prototype
[Bar chart: SIMD x-times faster than non-SIMD; y-axis 0 to 14]
Benchmarks: Transpose4x4, AOBench, Mandelbrot, MatrixMultiplication, VertexTransform, Average, ShiftRows, Matrix4x4Inverse
Platforms:
• 3rd Generation Intel® Core™ i7 processor (3667U) @ 2.00 GHz, 32-bit, Ubuntu* 13
• 3rd Generation Intel® Core™ i7 processor (3667U) @ 2.00 GHz, 64-bit, Ubuntu* 13
• Intel® Atom™ processor Z3770 @ 1.46GHz, Android* 4.4
Data labels (in chart order): 3.2, 3.6, 3.8, 3.9, 4.6, 5.0, 6.0, 9.5, 3.2, 3.8, 3.4, 6.1, 6.5, 5.0, 5.6, 11.8, 6.8, 3.1, 2.7, 4.5, 4.2, 3.8, 5.4, 9.3
Excellent early results while still focused on functionality
Theoretical speedup limit is 4
SIMD.JS benchmarks: https://github.com/johnmccutchan/ecmascript_simd/tree/master/src/benchmarks
16
SIMD.JS Proposal and Polyfill API
API SIMD Number (Google’s John McCutchan & Intel’s Peter Jensen): http://wiki.ecmascript.org/doku.php?id=strawman:simd_number
Polyfill API: https://github.com/johnmccutchan/ecmascript_simd
Types: float32x4, int32x4, Float32x4Array, Int32x4Array
Constructors: float32x4(x,y,z,w) float32x4.zero() float32x4.splat(s)
Operations: abs, neg, add, sub, mul, div, clamp, min, max, reciprocal, reciprocalSqrt, scale, sqrt, shuffle, shuffleMix, withX, withY, withZ, withW, lessThan, lessThanOrEqual, equal, notEqual, greaterThanOrEqual, greaterThan, bitsToInt32x4, toInt32x4, …
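To show how the constructors and operations listed above fit together, here is a sketch of the list-average workload written against a tiny stand-in type. The stand-in, `f32x4`, is hypothetical and written in plain JavaScript; it borrows the names `zero`, `splat` and `add` from the slide but is not the polyfill, and it gives no hardware acceleration. The example also assumes the list length is a multiple of 4.

```javascript
// Hypothetical stand-in for the proposed float32x4 type (no hardware SIMD).
function f32x4(x, y, z, w) {
  return {
    x: Math.fround(x),
    y: Math.fround(y),
    z: Math.fround(z),
    w: Math.fround(w)
  };
}
f32x4.zero = function () { return f32x4(0, 0, 0, 0); };
f32x4.splat = function (s) { return f32x4(s, s, s, s); };
f32x4.add = function (a, b) {
  return f32x4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
};

// Average of a list, accumulating four lanes at a time.
// Assumes list.length is a multiple of 4.
function average(list) {
  var sum = f32x4.zero();
  for (var i = 0; i < list.length; i += 4) {
    sum = f32x4.add(sum, f32x4(list[i], list[i + 1], list[i + 2], list[i + 3]));
  }
  // Horizontal add of the four lanes, then divide by the element count.
  return (sum.x + sum.y + sum.z + sum.w) / list.length;
}

var avg = average(new Float32Array([1, 2, 3, 4, 5, 6, 7, 8]));
console.log(avg); // 4.5
```

The loop runs a quarter as many iterations as a scalar version, which is where the speedup comes from when `add` maps to one SIMD instruction.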
The joint Google*/Intel/Mozilla* SIMD.JS proposal was approved to advance to the next stage of ECMAScript* TC39 standardization†
† A copy of the TC39 Presentation: http://esdiscuss.org/notes/2014-07/simd-128-tc39.pdf
17
Emscripten now targets SIMD.JS
Emscripten generates SIMD.JS from C++ SIMD intrinsics & auto-vectorized code
C/C++ → JavaScript*
Near-native SIMD.JS speedup
Speedup over Scalar JS:
• Scalar JS: 1.00
• Scalar C++: 2.03
• SIMD JS: 7.18
• SIMD C++: 8.13
18
Crosswalk in Brief
Application Runtime
Follow us at @xwalk_project
crosswalk-project.org
Open Source, using Blink* & Chromium*
Today on Android* and Tizen*
Easy addition of extensible APIs
Easy access to device APIs
Intel® platform capabilities
Latest HTML5 features in packaged web apps
Focuses on security, performance and standards compliance
Based on web technologies: HTML5, CSS3, JavaScript*
Updated & released to the latest Chromium every 6 weeks
19
Intel® XDK – Cross-platform Development Kit
Develop, debug, profile, and build responsive web & hybrid apps
Free at http://xdk.intel.com
Remote debugging & profiling
20
Agenda
• HTML5 - The New Lingua Franca?
• Exposing the full power of modern hardware to JavaScript*
• Bringing Perceptual Computing to the web platform
• Supporting JavaScript programming in Internet of Things (IoT)
• Summary
21
Toward Perceptual Computing†
Devices sense & perceive user actions in a natural & intuitive way
† Source: Intel® Perceptual Computing SDK: www.intel.com/software/perceptual
Speech Recognition
Close-Range Tracking
Gesture Recognition
2D/ 3D Object Tracking
Facial Analysis
22
Reinventing Everyday Usages
Perceptual Computing opens up new dimensions in interacting with machines
• Learning & Education
• 3D Scanning and Sharing: Scan it, Share it, Customize & Print it
• Immersive Collaboration
• Gaming
• Out-of-reach Device Input
23
Demos: Media Capture Depth Stream Extension†
† Source: Intel® Ningxin Hu: https://github.com/huningxin/depth_stream_examples
WebRTC Google* Code: http://webrtc.googlecode.com/svn/trunk/samples/js/demos/html/
Magic Xylophone: Soundstep*.com: http://www.soundstep.com/blog/experiments/jsdetection/
24
Enabling 3D Camera on Web Platform
3D Camera
• Beyond color: additional per-pixel distance
• Intel® RealSense™ on PC & tablets soon
Applications
• Real-time hand/finger/object tracking
• 3D scanning
• Video conferencing
Depth on Web Platform†
• Media Capture Depth Stream Extension
• Rendering & post-processing: <video>, <canvas>, WebGL* and SIMD.JS
• Streaming: transmit as MediaStream via WebRTC RTCPeerConnection
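As an illustration of the post-processing step, the sketch below turns a depth frame into grayscale RGBA bytes of the kind a `<canvas>` can display through `ImageData`. The input format is an assumption made for the example (one unsigned 16-bit distance per pixel, 0 meaning "no reading"); the slides do not say how the proposed extension delivers depth data, and `depthToGrayscale` is a hypothetical helper.

```javascript
// depth: Uint16Array with one distance per pixel (assumed units; 0 = no reading).
// maxDepth: distances at or beyond this value are drawn black.
// Returns RGBA bytes, four per pixel, with nearer pixels brighter.
function depthToGrayscale(depth, maxDepth) {
  var rgba = new Uint8ClampedArray(depth.length * 4);
  for (var i = 0; i < depth.length; i++) {
    var d = depth[i];
    var v = d === 0 ? 0 : 255 * (1 - Math.min(d, maxDepth) / maxDepth);
    rgba[4 * i] = v;       // red
    rgba[4 * i + 1] = v;   // green
    rgba[4 * i + 2] = v;   // blue
    rgba[4 * i + 3] = 255; // alpha: fully opaque
  }
  return rgba;
}

var pixels = depthToGrayscale(new Uint16Array([0, 250, 1000, 2000]), 1000);
console.log(pixels[4]); // 191 (the pixel at distance 250)
```

The per-pixel loop is independent for every pixel, so it is also the kind of kernel that SIMD.JS or WebGL could accelerate.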
† Source: Intel® Ningxin Hu: https://github.com/huningxin/depth_stream_examples
25
Proposed Media Capture Depth Stream Extension†
† Source: http://w3c.github.io/mediacapture-depth/
Web Application
Browser or HTML5 runtime
RGB Stream
Depth Stream
getUserMedia (WebRTC) API
26
Wireless Display for the Web
Unlock exciting new user experiences in HTML5
• Gaming
• Presentation
• Media Sharing/Casting†
† Big Buck Bunny video: http://www.bigbuckbunny.org/
27
HTML5 Presentation API Proposal†
• Connects web content to screens around you
• Hides display connection technologies from the developer
• Apple* AirPlay*, Microsoft* PlayTo*, Google* Chromecast*, Miracast*, Intel® WiDi
• Simple, high level API, easy to use
http://webscreens.github.io/presentation-api/
New standards-based feature for the cross-platform web
† Source: Intel® Dominik Röttsches
28
Agenda
• HTML5 - The New Lingua Franca?
• Exposing the full power of modern hardware to JavaScript*
• Bringing Perceptual Computing to the web platform
• Supporting JavaScript programming in Internet of Things (IoT)
• Summary
29
Intel® XDK IoT Edition
Companion Apps
Streamlined Workflow: Design, Test, and Build Tools
• Quick start samples and templates
• Built-in editor and emulators
• UI Frameworks and Apache Cordova* APIs
• Test and debug tools
• Integration with Cloud Services APIs
Design and build cross-platform companion apps easily for Android*, iOS*, and Windows*
30
Intel® XDK IoT Edition
JavaScript* apps on IoT devices
Integrated Development Environment: Create, Debug, and Run Tools
• JavaScript allows easy on-board app development and deployment for many IoT devices
• Use JavaScript to define behavior of IoT device
• Deploy, run, debug on IoT device with JavaScript
• Integration with cloud, web services, and sensors through JavaScript APIs
IoT Device
Edit JavaScript app
Send app to device
Run app remotely
Remote debug
Development Platform
Development System
31
Demo: Programming Internet of Things using Intel® XDK IoT Edition
RGB Lighting†
Internet of Things (IoT) Device (Intel® Galileo):
• PWM LED Controller on I2C bus
• RGB LED
• Node.js with Socket.io server
HTML App (Lenovo* K900):
• Socket.io connection to IoT device
• Change lighting color
• Cordova* App
Both made using: Intel® XDK IoT Edition
† Source: Intel® Dan Yocom: http://xdk-software.intel.com/iot_edition_demo_video
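On the device side, a color picked in the app has to become brightness levels for the three LED channels. The demo's actual code is not shown, so the following is only a sketch of that one step: `hexToDutyCycles` is a hypothetical helper that maps a `#RRGGBB` string to PWM duty cycles between 0 and 1. Receiving the color over Socket.io and writing the values to the PWM controller are left out.

```javascript
// Convert "#RRGGBB" (or "RRGGBB") into PWM duty cycles in the range 0..1.
// Returns null when the input is not a six-digit hex color.
function hexToDutyCycles(hex) {
  var m = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) {
    return null;
  }
  return {
    r: parseInt(m[1], 16) / 255,
    g: parseInt(m[2], 16) / 255,
    b: parseInt(m[3], 16) / 255
  };
}

var duty = hexToDutyCycles("#ff0080");
console.log(duty.r, duty.g); // 1 0
```

Validating the string before use matters here because the value arrives over the network from another device.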
32
Agenda
• HTML5 - The New Lingua Franca?
• Exposing the full power of modern hardware to JavaScript*
• Bringing Perceptual Computing to the web platform
• Supporting JavaScript programming in Internet of Things (IoT)
• Summary
33
Summary
• HTML5 is closing the gaps with native models
• SIMD in JavaScript* enables a large new class of high-performance apps
• JavaScript is about to get a lot faster for such domains as gaming
• Depth Camera support in HTML5 WebRTC enables exciting use cases
• JavaScript is proliferating rapidly in Internet of Things
• Intel® XDK supports end-to-end programming for Internet of Things
• HTML5 is the application model of the future
34
Web: The Ubiquitous Software Platform and the Application Model of the Future
[Diagram labels: Big Data & Content; Rich Capabilities; Social; Contextual; Crowdsourced; Sensors; “Things”]
35
Call to Action
• Download Firefox* Nightly and experience† the benefits of SIMD.JS
• Leverage the power of SIMD.JS through Intel® XDK and Crosswalk
• Download Intel® XDK free at http://xdk.intel.com
† SIMD.JS demos: http://peterjensen.github.io/idf2014-simd
36
Intel® Developer Zone
• Free tools and code samples
• Technical articles, forums and tutorials
• Connect with Intel and industry experts
• Get development support
• Build relationships
Tools. Knowledge. Community.
software.intel.com
37
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm
Intel, Core, Atom, Xeon Phi, RealSense, Look Inside and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright ©2014 Intel Corporation.
38